




INVESTOR IN PEOPLE



The Patent Office 
Concept House 



Cardiff Road 
Newport 



South Wales 
NP10 8QQ



I, the undersigned, being an officer duly authorised in accordance with Section 74(1) and (4) 
of the Deregulation & Contracting Out Act 1994, to sign and issue certificates on behalf of the 
Comptroller-General, hereby certify that annexed hereto is a true copy of the documents as 
originally filed in connection with the patent application identified therein. 



In accordance with the Patents (Companies Re-registration) Rules 1982, if a company named 
in this certificate and any accompanying documents has re-registered under the Companies Act 
1980 with the same name as that with which it was registered immediately before re- 
registration save for the substitution as, or inclusion as, the last part of the name of the words 
"public limited company" or their equivalents in Welsh, references to the name of the company 
in this certificate and any accompanying documents shall be treated as references to the name 
with which it is so re-registered. 

In accordance with the rules, the words "public limited company" may be replaced by p.l.c.,
plc, P.L.C. or PLC.

Re-registration under the Companies Act does not constitute a new legal entity but merely 
subjects the company to certain additional company law rules. 




Signed 



Dated 21 December 2000 




An Executive Agency of the Department of Trade and Industry 






The Patent Office 
Cardiff Road 
Newport 

Gwent NP9 1RH 



1. Your reference 

2641701/NF 



2. Patent Application Number 

9927907.7

3. Full name, address and postcode of the or of each applicant (underline all surnames) 

Canon Kabushiki Kaisha 
30-2 3-Chome Shimomaruko 
Ohta-Ku 
Tokyo 

Japan




Patents Form 1/77 
Patents Act 1977

(Rule 16) 



Request for grant of a patent



The 
Patent 
Office 



Patents ADP number (if known) 

If the applicant is a corporate body, give the
country/state of its incorporation

Country: JAPAN
State:



4. Title of the invention 

IMAGE PROCESSING METHOD AND APPARATUS 



5. 


Name of agent 


Beresford & Co 




"Address for Service" in the United Kingdom 
to which all correspondence should be sent 


2/5 Warwick Court 
High Holborn 
London WC1R 5DJ




Patents ADP number 




6. 


Priority details 






Country Priority application number 


Date of filing 



Patents Form 1/77 



7. If this application is divided or otherwise derived from an earlier UK application give details 
Number of earlier application Date of filing 



8. Is a statement of inventorship and of right to grant of a patent required in support of this
request?

YES 



9. Enter the number of sheets for any of the following items you are filing with this form.

Continuation sheets of this form
Description 97
Claim(s) 17
Abstract 1
Drawing(s) 17

10. If you are also filing any of the following, state how many against each item. 

Priority documents 

Translations of priority documents 

Statement of inventorship and 

right to grant of a patent (Patents form 7/77) YES 2+2 

Request for preliminary examination 
and search (Patents Form 9/77) 

Request for Substantive Examination 
(Patents Form 20/77) 

Any other documents 
(please specify) 



11. I/We request the grant of a patent on the basis of this application




Signature Date 25 November 1999

BERESFORD & Co 



12. Name and daytime telephone number of 
person to contact in the United Kingdom 



NICHOLAS FOX 
Tel: 0171-831-2290



Patents Form 7/77 
Patents Act 1977 

(Rule 15) 




Statement of inventorship and of
right to grant of a patent

The Patent Office
Cardiff Road
Newport
Gwent NP9 1RH



1. 


Your reference 




2641701/NF 


2. 


Patent Application Number 9927907.7


3. 


Full name of the or each applicant 




Canon Kabushiki Kaisha 


4. 


Title of the invention 




IMAGE PROCESSING METHOD AND APPARATUS 


5. 


State how the applicant(s) derived the right from the inventor(s) to be granted a patent



By virtue of the employment of the inventor by Canon Research Centre Europe Ltd, and by virtue 
of an agreement between Canon Research Centre Europe Ltd and Canon Kabushiki Kaisha dated 
1 January 1994. 



6. How many, if any additional Patents Forms 

7/77 are attached to this form? 



7. I/We believe that the person(s) named over the page (and on any extra copies of this form) is/are the
inventor(s) of the invention which the above patent application relates to.



Signature 




Date 25 November 1999 



BERESFORD & Co 



8. 



Name and daytime telephone number of 
person to contact in the United Kingdom 



NICHOLAS FOX 
Tel: 0171-831-2290 



Patents Form 7/77 



DR ADAM MICHAEL BAUMBERG 

c/o Canon Research Centre Europe Ltd 
1 Occam Court 
Occam Road 

Surrey Research Park 

Guildford
Surrey GU2 5YJ



DUPLICATE 



2641701 



IMAGE PROCESSING METHOD AND APPARATUS 

The present invention relates to the detection and 
matching of features in images. The present invention 
may be used to match features in different images. 
Alternatively, the invention may be used to identify
features in images for the purpose of, for example,
indexing or categorisation.

The present invention is particularly suitable for the
identification of points within images corresponding to
the same physical point on an object seen from two
viewpoints. By identifying points within images
corresponding to the same physical point on an object, it
is possible to establish the relative positions from
which image data has been obtained. The image data can
then be used to generate a three-dimensional model of the
object appearing in the images.

The appearance of an object in an image can change in a 
number of ways as a result of changes of camera 
viewpoint. If points in images taken from different 
camera viewpoints are to be matched, it is therefore 
necessary to characterize points within images in a way 
which is not affected by the introduced distortion so 
that matching is possible.

A number of ways of characterizing features in images
have been suggested. One example is the use of
rotational invariants suggested by Gouet et al, in "A
Fast Matching Method for Colour Uncalibrated Images Using
Differential Invariants", British Machine Vision
Conference 98, Volume 1, pages 367-376. This suggests
characterizing feature points in images using 
differential texture invariants which are invariant under 
rotation. In this way, rotation of a camera may be 
accounted for. Furthermore, small variations in camera 
position give rise to distortions which may be 
approximated as rotations and hence the use of rotational 
invariants is also suitable to account for some other 
distortions.

However, some changes of viewpoint give rise to 
distortions which cannot be approximated to rotations. 
The matching of feature points in such images may 
therefore be unsatisfactory. 

In one aspect, the present invention aims to provide an 
apparatus which more accurately matches feature points in 
images of the same object taken from different 
viewpoints.

In accordance with one aspect of the present invention 
there is provided an apparatus for matching features in 
images comprising:

input means for receiving image data;

characterization means for characterizing points 
within images corresponding to received image data; and 

matching means for matching points within image data 
on the basis of the characterization of points 
characterized by said characterization means, 
characterized in that: 

said characterization means is arranged to 
characterize points within images, wherein said 
characterization is substantially unaffected by affine
distortions of a portion of an image centred on said 
feature point. 

When images of planar surfaces are taken from different 
positions relative to the surface, the surfaces appear to 
undergo affine transformations. By providing an 
apparatus which characterizes portions of images in a way 
which is substantially unaffected by affine distortions 
the matching of points on planar surfaces of objects in 
images taken from different view points can be improved. 
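
The effect described above can be illustrated with a short worked example. The following Python sketch is illustrative only and is not part of the described apparatus: the matrix A, the offset t and the helper names are hypothetical and were chosen purely for the demonstration. It applies an affine transformation to the corners of a square and checks that the result is a parallelogram, which is the kind of stretch and skew an image of a planar surface may undergo.

```python
def apply_affine(points, a, t):
    """Map each (x, y) through x' = A x + t for a 2x2 matrix A and offset t."""
    (a11, a12), (a21, a22) = a
    tx, ty = t
    return [(a11 * x + a12 * y + tx, a21 * x + a22 * y + ty) for x, y in points]


def is_parallelogram(corners, tol=1e-9):
    """True if one pair of opposite sides are equal vectors.

    For a quadrilateral listed in order around its outline this is
    sufficient for the other pair of sides to be equal as well.
    """
    p0, p1, p2, p3 = corners
    side_a = (p1[0] - p0[0], p1[1] - p0[1])
    side_b = (p2[0] - p3[0], p2[1] - p3[1])
    return abs(side_a[0] - side_b[0]) < tol and abs(side_a[1] - side_b[1]) < tol


# Corners of a square region, listed in order around its outline.
square = [(0.0, 0.0), (1.0, 0.0), (1.0, 1.0), (0.0, 1.0)]

# Hypothetical stretch-and-skew matrix and translation (illustrative values).
A = ((1.5, 0.4), (0.2, 0.8))
t = (3.0, -1.0)

warped = apply_affine(square, A, t)
print(is_parallelogram(warped))
```

Because an affine map preserves parallelism, any choice of A and t gives a parallelogram here; a characterization that is unaffected by such maps would describe the square and its warped counterpart in the same way.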

Another embodiment of the present invention comprises an 
apparatus for comparing an image against a database of 
images utilizing apparatus for matching feature points in 
the images, as has been described above. 

Further aspects and embodiments of the present invention 
will become apparent when reading the following
description with reference to the accompanying drawings
in which: 

Figure 1 is a block diagram of a modular system for 
generating three-dimensional computer models from images
of objects in which the present invention may be 
embodied; 

Figures 2A and 2B are a pair of illustrative examples of 
images of an object taken from two different viewpoints; 

Figures 3 and 4 are a further pair of illustrative 
examples illustrating the effect of changing camera 
viewpoint; 

Figure 5 is a block diagram of a feature detection and 
matching module in accordance with the first embodiment 
of the present invention; 

Figure 6 is a flow diagram of the processing of the 
control module program of the feature detection and 
matching module of Figure 5; 

Figures 7A and 7B are a flow diagram of the processing of 
data in accordance with the detection module program of 
the feature detection and matching module of Figure 5; 



Figure 8 is a flow diagram of the processing of the
characterization module of the feature detection and
matching module of Figure 5; 

Figures 9A, 9B and 9C are a flow diagram of the 
calculation of rotational invariants by the 
characterization module; 

Figures 10, 11, 12A, 12B, 13A and 13B are illustrative 
examples of the distribution of scaling factors used in 
scaling masks to calculate approximations of complex 
coefficients for the calculation of rotation invariants; 

Figure 14 is a flow diagram of the processing of the 
matching module of the feature detection and matching 
module of Figure 5; 

Figure 15 is a block diagram of an apparatus for 
retrieving images from a database of images utilizing a 
characterization and matching module in accordance with 
a third embodiment of the present invention; and 

Figure 16 is a block diagram of an apparatus for 
generating images in which the effects of stretch and 
skew resulting from affine transformations of an image 
are removed in accordance with a fifth embodiment of the 
present invention.

FIRST EMBODIMENT

Figure 1 schematically shows the components of a modular
system in which the present invention may be embodied.
These components can be effected as processor-implemented
instructions, hardware or a combination thereof.

Referring to Figure 1, the components are arranged to
process data defining images (still or moving) of one or
more objects in order to generate data defining a three-
dimensional computer model of the object(s).

The input image data may be received in a variety of 
ways, such as directly from one or more digital cameras, 
via a storage device such as a disk or CD ROM, by 
digitisation of photographs using a scanner, or by
downloading image data from a database, for example via 
a data link such as the Internet, etc. 

The generated 3D model data may be used to: display an
image of the object(s) from a desired viewing position;
control manufacturing equipment to manufacture a model of
the object(s), for example by controlling cutting
apparatus to cut material to the appropriate dimensions;
perform processing to recognise the object(s), for
example by comparing it to data stored in a database;
carry out processing to measure the object(s), for
example by taking absolute measurements to record the
size of the object(s), or by comparing the model with
models of the object(s) previously generated to determine
changes therebetween; carry out processing so as to
control a robot to navigate around the object(s); store
information in a geographic information system (GIS) or 
other topographic database; or transmit the object data 
representing the model to a remote processing device for 
any such processing, either on a storage device or as a 
signal (for example, the data may be transmitted in 
virtual reality modelling language (VRML) format over the 
internet, enabling it to be processed by a WWW browser); 
etc. 

The feature detection and matching module 2 is arranged 
to receive image data recorded by a still camera from 
different positions relative to the object(s) (the 
different positions being achieved by moving the camera 
and/or the object(s)) or frames from a video camera, 
where there is an interruption and change of view point 
within a stream of video images such as arises when a 
user switches off a video camera and restarts filming an 
object from a different position. The received data is 
then processed in order to match features within the 
different images (that is, to identify points in the 
images which correspond to the same physical point on the 
object(s)).

The feature detection and tracking module 4 is arranged 
to receive image data recorded by a video camera as the
relative positions of the camera and object(s) are
changed (by moving the video camera and/or the 
object(s)). As in the feature detection and matching 
module 2, the feature detection and tracking module 4 
detects features, such as corners, in the images. 
However, the feature detection and tracking module 4 then 
tracks the detected features between frames of image data 
in order to determine the positions of the features in 
other images. 

The camera position calculation module 6 is arranged to 
use the features matched across images by the feature 
detection and matching module 2 or the feature detection 
and tracking module 4 to calculate the transformation 
between the camera positions at which the images were 
recorded and hence determine the orientation and position 
of the camera focal plane when each image was recorded. 

The feature detection and matching module 2 and the 
camera position calculation module 6 may be arranged to 
perform processing in an iterative manner. That is, 
using camera positions and orientations calculated by the 
camera position calculation module 6, the feature 
detection and matching module 2 may detect and match 
further features in the images using epipolar geometry in
a conventional manner, and the further matched features
may then be used by the camera position calculation
module 6 to recalculate the camera positions and
orientations.

If the positions at which the images were recorded are 
already known, then, as indicated by arrow 8 in Figure 1, 
the image data need not be processed by the feature 
detection and matching module 2, the feature detection 
and tracking module 4, or the camera position calculation
module 6. For example, the images may be recorded by
mounting a number of cameras on a calibrated rig arranged
to hold the cameras in known positions relative to the
object(s).

Alternatively, it is possible to determine the positions 
of a plurality of cameras relative to the object(s) by
adding calibration markers to the object(s) and
calculating the positions of the cameras from the
positions of the calibration markers in images recorded
by the cameras. The calibration markers may comprise
patterns of light projected onto the object(s). Camera
calibration module 10 is therefore provided to receive
image data from a plurality of cameras at fixed positions
showing the object(s) together with calibration markers,
and to process the data to determine the positions of the
cameras. A preferred method of calculating the positions
of the cameras (and also internal parameters of each
camera, such as the focal length etc) is described in
"Calibrating and 3D Modelling with a Multi-Camera System"
by Wiles and Davison in 1999 IEEE Workshop on Multi-View
Modelling and Analysis of Visual Scenes, ISBN 0769501109.

The 3D object surface generation module 12 is arranged to 
receive image data showing the object(s) and data 
defining the positions at which the images were recorded, 
and to process the data to generate a 3D computer model 
representing the actual surface(s) of the object(s), such 
as a polygon mesh model. 

The texture data generation module 14 is arranged to 
generate texture data for rendering onto the surface 
model produced by the 3D object surface generation module 
12. The texture data is generated from the input image
data showing the object(s). 

Techniques that can be used to perform the processing in 
the modules shown in Figure 1 are described in EP-A- 
0898245, EP-A-0901105, pending US applications 09/129077, 
09/129079 and 09/129080, the full contents of which are 
incorporated herein by cross-reference, and also Annex A. 

The present invention may be embodied in particular as 
part of the feature detection and matching module 2 
(although it has applicability in other applications, as 
will be described later).

Prior to describing in detail a feature detection and 
characterization module 2 in accordance with a first
embodiment of the present invention, the problems of
accurately matching points within images of an object 
seen from different viewpoints arising due to the 
differences in appearance resulting from a change of view 
point of an object will briefly be discussed. 

Figures 2A and 2B are illustrative examples of two images 
recorded by a still camera from different positions 
relative to the same object. In this example the image
20 of Figure 2A comprises an image of a house 22 as 
viewed from in front. In the image can be seen four 
windows 24,26,28,30, a front door 32 and a chimney 34. 
Next to the house to the right of the house there is a 
flower 36. 

The image 40 of Figure 2B comprises an image of the same 
house 42 taken from a viewpoint to the left of the 
position in which the first image 20 has been taken. 
Again visible in the image are four windows 44,46,48,50, 
a front door 52 and a chimney 54. A flower 56 is also 
visible to the right of the house 42. 

As an initial step for establishing the relative camera 
positions between two images of the same object, it is 
necessary to establish which points in the images 
correspond to the same physical points of the objects 
appearing within the images. Where a sequence of images 
are taken with a video camera the differences between
consecutive images, unless there is an interruption in
the video image stream, are usually very small. It is
therefore possible, provided there has been no
interruption in the video image stream, to constrain the
search for points in images which correspond to the same
physical point on an object to a small area in the same
region of a second image and then determine the effect of
moving the camera in terms of a translation applied to
pixels within that portion of the image.
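
The constrained search described above can be sketched as follows. This Python example is a minimal illustration under stated assumptions and is not taken from the described apparatus: the function names, the patch and search sizes, and the use of a sum of squared grey-level differences as the similarity measure are all hypothetical choices. Images are represented as lists of rows of grey values.

```python
def patch_ssd(img_a, img_b, ax, ay, bx, by, half):
    """Sum of squared grey-level differences between two square patches.

    The patch centred on (ax, ay) in img_a is assumed to lie inside img_a.
    """
    total = 0
    for dy in range(-half, half + 1):
        for dx in range(-half, half + 1):
            diff = img_a[ay + dy][ax + dx] - img_b[by + dy][bx + dx]
            total += diff * diff
    return total


def find_translation(img_a, img_b, x, y, half=1, search=2):
    """Search a small neighbourhood of (x, y) in img_b for the best match
    to the patch centred on (x, y) in img_a; return the offset (dx, dy)."""
    height, width = len(img_b), len(img_b[0])
    best = None
    for oy in range(-search, search + 1):
        for ox in range(-search, search + 1):
            bx, by = x + ox, y + oy
            # Skip candidates whose patch would fall outside the image.
            if not (half <= bx < width - half and half <= by < height - half):
                continue
            score = patch_ssd(img_a, img_b, x, y, bx, by, half)
            if best is None or score < best[0]:
                best = (score, ox, oy)
    return best[1], best[2]
```

A search of this kind is only adequate while the change between images is small enough to be treated as a translation, which is the limitation discussed in the following paragraphs.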

In contrast, where a still camera is used to obtain image
data of objects from different viewpoints or where a
video camera has been switched off between two image
frames in the video stream the difference between the
view point in two images can be much larger. As the
difference in viewpoints increases it is no longer
adequate to assume that the change in viewpoint can be
approximated as a translation of portions of an image
since in addition to translation the parts of an image
are also distorted as a result of the change of view
point.

Thus for example looking at the exemplary images of 
Figures 2A and 2B it is apparent that the square windows 
24,26,28,30 appearing in the image 20 of Figure 2A are
stretched and skewed so as to appear as parallelograms
44,46,48,50 in the image 40 of Figure 2B. This is in
addition to the windows 44,46 on the left hand side of
the house being translated further down the image and the
windows 48,50 on the right hand side of the house being
translated up in the image 40 of Figure 2B compared to
the same windows 24-30 in the image 20 of Figure 2A.
Furthermore, in contrast to the appearance in the image
20 of Figure 2A, in the image 40 of Figure 2B because the
windows 44,46 on the left hand side of the house are now
closer to the camera than the windows 48,50 on the right
hand side of the house, the relative proportions of the
windows have changed with the windows 44,46 on the left
hand side of the house in the second image 40 being 
larger than the windows 48,50 on the right hand side of 
the house. 

Since the appearance of an object can change 
significantly it is necessary to identify characteristics 
of an image which are not affected by the distortions 
resulting from a change of viewpoint. By characterizing 
points within an image which are not significantly 
affected by the distortions of the appearance of an image 
resulting from changes in camera position, it is possible 
to use the characterization of an image to establish 
which points within pairs of images correspond to the same
physical points on an object. 

Figures 3 and 4 are two further exemplary images to 
illustrate a further problem with the matching of the 
points in images corresponding to the same physical
points on an object. One of the problems of matching
feature points in images of an object arises from the
possibility that an object in one image may appear as a
smaller or larger object in another image due to the fact
that the two images have been taken from camera positions
further from or closer to an object.

Figure 3 is an image showing a building block 100 in the 
foreground of a window 102 in the background with a 
landscape 104 visible through the window. The window
panes of the window 102 form a cross at the centre of the 
window where they meet. 

Figure 4 is an example of an image of the same scene 
taken from a camera viewpoint much closer to the
building block 100. In the image of Figure 4 the
building block 100 appears to be much larger than it does
in the image of Figure 3.

The possibility that objects may appear to be of
different sizes in different images due to a change of
camera viewpoint gives rise to two separate problems when
attempting to establish correspondence between points in
one image and points in another image.

The first problem arising from changes of camera 
viewpoint that may cause a change of scale is that a 
change of scale may cause different points of interest to
be selected for future characterization, thus making
future matching impossible. This problem arises because 
some large scale features such as the cross at the centre 
of the image of Figure 3 may only become apparent when a 
large area of an image is considered. However, if only 
large areas of images are considered for the detection of 
points of interest, smaller feature points such as the 
corners of the building block 100 as it appears in Figure 
3 may be overlooked. However, where changes of scale are 
likely to occur it is necessary that both large and small 
features are detected since these may subsequently appear 
as small or large features in future images. Thus for 
example the small feature that appears as the corner of 
the building block 100 in the image of Figure 3 appears
as a far larger feature in the image of Figure 4.

The second problem arising due to changes of scale arises 
after a selection of features of interest has been made. 
When feature points of interest have been selected, the 
features need to be characterized so that matching may 
occur. If features appearing as a large feature in one 
image are to be matched with the features which appear as 
a small feature in another image, it can be important to 
account for the fact that the features appear at 
different sizes as the characterization of a feature may 
vary due to the apparent size of the feature in an image. 
If no allowance is made for the possibility that the same 
feature may appear at different scales in different
images when characterizing features, the characterization
of an image feature may be dependent on the size at which
it appears and hence matching different sized
representations of the same image may be impossible using
such characterizations.

The present embodiment includes a feature matching and 
detection module 2 which provides a number of means by 
which differences in images arising from a change of 
camera viewpoint can be accounted for, and which hence
facilitates the matching of features appearing in images
taken from spaced view points, as will now be
described.

FEATURE DETECTION AND MATCHING MODULE 

Figure 5 is a block diagram of a feature detection and 
matching module 2 in accordance with the first embodiment 
of the present invention. The feature detection and 
matching module 2 in this embodiment is arranged to 
receive grey scale image data recorded by a still camera 
from different positions relative to an object or video 
image data where an interruption in a video stream has 
occurred and filming has restarted from a different 
position and to output a list of pairs of co-ordinates of 
points in different images which correspond to the same 
physical point of the object appearing in the images. 
The list of pairs of co-ordinates can then be used by the 
camera position calculation module 6 to determine the
orientation and position of the camera focal plane when
each image was recorded. In this embodiment the feature
matching and detection module 2 is arranged to perform
processing iteratively with the camera position
calculation module 6 to match image feature points
utilizing calculated camera positions and then refine
calculated camera positions on the basis of those matched
feature points.

The feature detection and matching module 2 comprises an
image buffer 60 for receiving grey scale image data,
comprising pixel data for images, and camera position 
data from the camera position calculation module. The 
image buffer 60 is connected to an output buffer 62 via 
a central processing unit (CPU) 64 which is arranged to 
process the image data stored in the image buffer 60 to 
generate a list of matched points output to the output 
buffer 62. The processing of image data by the CPU is in 
accordance with a set of programs stored within a read 
only memory (ROM) 66 which is connected to the CPU 64. 
In this embodiment the feature detection and matching 
module 2 is arranged to receive and process images of 768 
by 576 pixels. 

The programs stored in the ROM 66 comprise a control 
module 70 for coordinating the overall processing of the 
programs stored in the ROM 66, a detection module 72 for 
identifying features to be matched between images, a
characterization module 74 for characterizing the
features detected by the detection module 72 and a
matching module 76 for matching features detected by the
detection module 72 on the basis of the characterization
of those features by the characterization module 74.

The CPU 64 is also connected to a random access memory 
(RAM) 78 which is used for the storage of variables 
calculated in the course of detecting features in images, 
characterizing those features and matching them to
generate an output list of matched points between pairs
of images.

Figure 6 is a flow diagram of the control module program 
70 for coordinating the flow of control of the processing
of data by the feature detection and matching module 2.
Initially the control module 70 waits until image data is
received (S1) and stored in the image buffer 60. This
causes the control module 70 to invoke the detection
module 72 to analyse the image data stored in the image
buffer 60 to ascertain (S2) a number of feature points
within the images stored in the image buffer 60 which are
to be further processed to determine whether they can be
matched as corresponding to the same physical point on an
object in two images stored within the image buffer 60 as
will be described in detail later. The co-ordinates of
the potential feature points of interest detected in the
images stored in the image buffer 60 are then stored in
RAM 78 together with other data relating to the feature
points for use in the subsequent processing by the CPU 64 
as will be described later. 

When the feature points for a pair of images have been 
determined and stored in RAM 78 the control module 70
then invokes the characterization module 74 to characterize
(S3) each of the detected feature points using portions
of the images around detected feature points as will be
described in detail later. Data representative of the
characterization of each of the feature points is then
stored in RAM 78 so that it may be used to match points
in different images as corresponding to the same physical
point in an object appearing in the images.

When all of the feature points in a pair of images have 
been characterized by the characterization module 74 the 
control module 70 then invokes the matching module 76 to 
match (S4) the feature points characterized by the 
characterization module 74 in different images as 
corresponding to the same physical point on an object on 
the basis of the characterization data stored in RAM 78. 

After the matching module 76 has determined the best 
matches for feature points characterized by the 
characterization module 74 the control module 70 causes 
a list of pairs of matched feature points to be output 
(S5) to the output buffer 62.
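
The flow of control of Figure 6 can be summarised in a short sketch. The Python function below is a hypothetical illustration only: the function name and the three callables passed to it stand in for the detection, characterization and matching modules, whose actual processing is described later, and the sketch assumes that a pair of images has already been received (S1).

```python
def run_matching_pipeline(image_pair, detect, characterize, match):
    """Mirror steps S2 to S5 for an already received pair of images (S1).

    detect(image) returns a list of (x, y) feature points;
    characterize(image, point) returns a descriptor for one point;
    match(descs_a, descs_b) returns index pairs (i, j) of matched points.
    All three callables are placeholders supplied by the caller.
    """
    # S2: detect candidate feature points in each image.
    points = [detect(image) for image in image_pair]
    # S3: characterize each detected point using the image around it.
    descriptors = [
        [characterize(image, point) for point in pts]
        for image, pts in zip(image_pair, points)
    ]
    # S4: match characterized points between the two images.
    index_pairs = match(descriptors[0], descriptors[1])
    # S5: output pairs of co-ordinates of matched points.
    return [(points[0][i], points[1][j]) for i, j in index_pairs]
```

Passing the three stages in as arguments keeps the control logic separate from the processing it coordinates, in the same way that the control module 70 only invokes the other modules.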

FEATURE DETECTION

The detection module 72 is arranged to process image data 
stored in the image buffer 60 to select a number of 
feature points which are candidates for matching by the 
characterization module 74 and the matching module 76. 

As part of the processing of image data to select feature 
points, the detection module 72 is arranged to generate 
smoothed image data by averaging values across a number 
of pixels to eliminate small features and to calculate 
feature strength values indicating the presence of 
features utilizing only limited areas of a smoothed image 
to eliminate large features. By linking these processes 
to a scaling factor and processing the image data for 
each of a predefined set of scaling factors, features of 
different sizes are detected and assigned feature 
strengths. In order that comparisons of feature strength 
can be made regardless of the scale factor which was used 
in the process to detect a feature, these feature 
strength values are calculated utilizing the selected 
scale factor to enable comparison of the strengths of 
features of different sizes as will now be described. 

Figures 7A and 7B are a flow diagram of the processing of 
data in accordance with the detection module 72 stored in 
ROM 66. In this embodiment of the present invention the 
feature points of images stored within the image buffer 
60 are selected on the basis of processing the image data
to detect points within the images representative of
corners on objects within the images. 

Initially (S10) the detection module 72 causes the CPU 64
to calculate a smoothed set of image data based on the
image data stored in the image buffer 60. In order to
calculate a grey scale value for each pixel in the
smoothed image, the sum of the grey scale pixel values
of a region of the image centred on corresponding pixels
in the image data is determined where the contribution of
each pixel in that region of the image is scaled in
accordance with a Gaussian function G(x,y) where:

$$G(x,y) = \exp\left(-\frac{x^2 + y^2}{2\sigma_s^2}\right)$$

where x and y are the x and y coordinates of a pixel
relative to the pixel for which a value in the smoothed
image is to be calculated and $\sigma_s$ is the first of
the set of scale factors stored in memory. In this
embodiment the detection module 72 is arranged to detect
features using a stored set of scale factors comprising
the values of 0.5, 0.707, 1, 1.414, 2, 2.828 and 4 with
the first scale factor being 0.5. Each of the scale
factors is associated with a stored window size of square
regions for calculating smoothed images and averaged
second moment matrices at an associated scale as will now
be described.

By calculating a smoothed image from the image data 
stored in the image buffer 60 a set of image data is 
obtained where the values for pixels in the smoothed 
image are dependent upon regions within the image. This
has the effect of eliminating from the image data very
small features which might otherwise be detected as
corners in the subsequent processing of the image.

The scale at which an image is smoothed determines the 
extent to which the pixel value for a pixel in the 
smoothed image is determined by neighbouring pixels. 
Where a small value is selected for $\sigma_s$, the effect
of scaling is such that the contribution of other pixels
reduces rapidly as the pixels get further away. Thus the
value for a corresponding pixel in the smoothed image may
be determined by only considering a small region of image
data centred on a pixel with the contribution of pixels
outside of that region being ignored. In contrast, for
larger values of $\sigma_s$ the contribution of more
distant pixels in the image data is more significant. It
is therefore no longer appropriate to ignore the
contributions of these more distant pixels. A larger
number of pixels in the image data must therefore be
considered for the calculation of pixel values in a
smoothed image at such
larger scale.

Thus in this embodiment of the present invention when
calculating a smoothed image at a scale associated with
a small value of $\sigma_s$, a 3 x 3 region of pixels
centred on a pixel in the original image is used to
determine a value of the corresponding pixel in the
smoothed image. For larger values of $\sigma_s$,
progressively larger square regions are used, with the
size of the region being selected so that the scaling for
those pixels whose contribution is not calculated is less
than a threshold value, for example $e^{-8}$. As stated
previously, each of these window sizes is stored in
association with a scale factor and utilised
automatically when the associated scale factor is
utilised to generate a smoothed image.
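
The relationship between the scale factor and the window
size described above can be sketched as follows. This is a
minimal illustration only: the function names, the default
threshold, the clamping of coordinates at the image border
and the division by the sum of the weights are assumptions
of the sketch and are not taken from the embodiment.

```python
import math

def gaussian_weight(dx, dy, sigma):
    """Unnormalised Gaussian scaling for a pixel at offset (dx, dy)."""
    return math.exp(-(dx * dx + dy * dy) / (2.0 * sigma * sigma))

def window_half_width(sigma, threshold=math.exp(-8.0)):
    """Grow the square window until the next on-axis pixel falls below the threshold."""
    half = 1  # a 3 x 3 window is the smallest region considered
    while gaussian_weight(half + 1, 0, sigma) >= threshold:
        half += 1
    return half

def smooth_pixel(image, row, col, sigma):
    """Gaussian-weighted average of the window centred on (row, col)."""
    half = window_half_width(sigma)
    rows, cols = len(image), len(image[0])
    total = 0.0
    weight_sum = 0.0
    for dy in range(-half, half + 1):
        for dx in range(-half, half + 1):
            r = min(max(row + dy, 0), rows - 1)  # clamp at the image border (assumption)
            c = min(max(col + dx, 0), cols - 1)
            w = gaussian_weight(dx, dy, sigma)
            total += w * image[r][c]
            weight_sum += w
    return total / weight_sum  # normalisation is an assumption of this sketch
```

In a stored-table arrangement of the kind described, the
half-widths would be computed once for each scale factor
and looked up rather than recalculated for every pixel.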

When a smoothed image has been calculated and stored in
memory 78 the detection module 72 then causes (S12) the
CPU 64 to calculate for each pixel in the smoothed image
a second moment matrix M where:

$$M = \begin{pmatrix} I_x^2 & I_x I_y \\ I_x I_y & I_y^2 \end{pmatrix}$$

where $I_x$ and $I_y$ are derivatives indicative of the
rate of change of grey scale pixel values for pixels in
the smoothed image along x and y coordinates respectively
calculated in a conventional manner by determining the
difference between grey scale values for adjacent pixels.

The calculated values for the second moment matrices for
each of the pixels in the smoothed image are then stored
in the memory 78 for future processing.

The detection module 72 then causes (S14) an averaged
second moment matrix for each of the pixels in a region
to be calculated by the CPU 64. These averaged second
moment matrices are calculated in a similar manner to the
calculation of the smoothed image in that the averaged
second moment matrix for a pixel is calculated from the
sum of the second moment matrices for pixels in a square
region centred on a selected pixel scaled by a scaling
factor G(x,y) where:

$$G(x,y) = \exp\left(-\frac{x^2 + y^2}{2\sigma_I^2}\right)$$

where x and y are the x and y coordinates of a pixel in a
square region relative to the pixel at its centre, for
which an averaged second moment matrix is to be
calculated, and $\sigma_I$ is a scale factor selected from
a stored set of scale factors.

As has previously been stated in relation to the
calculation of a smoothed image from received image data,
since the scale selected for an averaging operation
determines the rate at which contributions from
surrounding pixels decline, the selected scale also
determines the size of the region centred on a pixel which
is relevant for determining the average, as the scaled
contribution of more distant pixels ceases to be of
importance. Thus, as in the case of the calculation of
the smoothed image, only a limited number of second moment
matrices for pixels adjacent to a selected pixel need to
be determined, with those pixels whose contribution is
scaled by a factor of less than a threshold value, in this
embodiment $e^{-8}$, being ignored.

In this embodiment of the present invention the scale
$\sigma_I$ at which second moment matrices in a region are
determined is set to be equal to $2\sigma_s$. In this way
the value determined for an averaged second moment matrix
centred on a pixel is determined on the basis of the
second moment matrices for pixels in a square region
whose size is dependent on the value of $\sigma_s$ which
is selected. Similarly the size of a region is selected by
utilising a window size stored in association with a
scale factor, which is twice the size of the window size
used for generating a smoothed image with the same
associated scale factor.

The combined effect of the smoothing operation to
generate a smoothed image and the subsequent averaging
operation to calculate an averaged second moment matrix
is to restrict the size of features which are detected by
the detection module 72. Both operations, since they
involve the determining of a calculated value for a pixel
utilizing a region of an image, act to eliminate the
effect of small features whose effect is spread by the
averaging process. However, since both processes only
calculate values for pixels based on fixed regions of
image data, features in the original image which are only
apparent when larger regions of image data are considered
will also be effectively filtered by the detection module
72. Thus the averaged second moment matrices calculated
for each pixel are representative of features in the
original image which have a size lying within a range
defined by $\sigma_s$.

For each of the pixels for which an averaged second
moment matrix has been calculated a normalised corner
strength is then determined (S16) by the detection module
72. In this embodiment the normalised corner strength
comprises a calculated value for a Harris corner detector
scaled by $\sigma_s^{-4}$. The normalised corner strength
for a pixel is calculated using the following equation:

$$\text{NormalisedCornerStrength} = \frac{1}{\sigma_s^4}\left[\det M_A - 0.04\,(\operatorname{trace} M_A)^2\right]$$

where $M_A$ is the averaged second moment matrix
calculated for a pixel.

The calculated normalised corner strength for a pixel,
the averaged second moment matrix and the co-ordinates of
the pixel are then stored (S18) in memory 78. In this
embodiment the normalised corner strength is used for
selecting feature points for further characterization as
will be described later. The averaged second moment
matrix is used in the subsequent processing of selected
feature points as will also be described later. By
storing the value of the averaged second moment matrix
the necessity of having to recalculate this matrix
subsequently is avoided.

By calculating the normalised corner strength in the
manner described above the calculated normalised corner
strength is independent of the value selected for
$\sigma_s$, since the differences in values arising from
the determination of an averaged second moment matrix for
a smoothed image across a region dependent upon a selected
value for $\sigma_s$ are accounted for by making the
normalised corner strength proportional to
$\sigma_s^{-4}$.
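
The calculation of a normalised corner strength from an
averaged second moment matrix can be illustrated by the
following sketch. The function names and the
representation of a symmetric 2 x 2 matrix as a tuple of
three entries are assumptions made for the illustration;
the constant 0.04 and the scaling by the fourth power of
the scale factor follow the description above.

```python
def second_moment(ix, iy):
    """Entries (Ix^2, Ix*Iy, Iy^2) of the second moment matrix for one pixel."""
    return (ix * ix, ix * iy, iy * iy)

def normalised_corner_strength(m_avg, sigma_s, k=0.04):
    """Harris response of an averaged second moment matrix, scaled by sigma_s**-4."""
    mxx, mxy, myy = m_avg
    det = mxx * myy - mxy * mxy
    trace = mxx + myy
    return (det - k * trace * trace) / sigma_s ** 4
```

A second moment matrix for a single pixel always has a
zero determinant, so a positive response can only arise
after averaging over a region in which the gradient
direction varies, which is the case at a corner.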

Thus if two different sized regions in two images
correspond to the same object taken from view points at
different distances from the object the calculated
normalised corner strengths for the same physical point
on an object will be comparable. Therefore by selecting
a set of feature points for further characterization on
the basis of the calculated normalised corner strengths,
the same feature points can be selected regardless of the
actual scale at which those features are detectable and
hence the same features should be selected regardless of
the apparent changes of size of an object due to changes
of view point.

The calculated normalised corner strength for a pixel is
indicative of a relative measure of the extent to which
a region of an image centred on a point is indicative of
a corner. Where a pixel is associated with a normalised
corner strength which is greater than that of its
neighbours, this indicates that the pixel corresponds most
closely to a point which has the appearance of a corner.
In order to identify those points within an image which
most strongly correspond to corners, the detection module
72 compares calculated normalised corner strengths for
each pixel with the calculated normalised corner strengths
for the neighbouring pixels. In this embodiment this is
achieved by the detection module 72 first determining
(S20) whether normalised corner strengths have been stored
for all the adjacent pixels in the region of the image for
which the locations of normalised corner strength maxima
are currently being determined. If the normalised corner
strength has not yet been calculated for all adjacent
pixels in this region of an image, the next pixel (S22)
is selected and an averaged second moment matrix for that
pixel and normalised corner strength is calculated and
stored (S22, S14-S18).

When the detection module 72 determines (S20) that
normalised corner strengths have been determined for all
pixels in the current region for which the local corner
strength maxima are to be calculated, the detection
module 72 then determines (S24) which of the pixels
correspond to local maxima of normalised corner strength.
The co-ordinates of these local maxima are then stored in
the memory 78 together with the associated normalised
corner strength, the averaged second moment matrix
calculated for that pixel, and the scale $\sigma_s$ at
which the corner was detected.
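
The determination of local maxima of normalised corner
strength within a region can be sketched as below. The use
of the eight immediately adjacent pixels as the
neighbourhood, the strict inequality and the omission of
border pixels are assumptions of this sketch; the
description above only requires comparison with
neighbouring pixels.

```python
def local_maxima(strength):
    """Return (row, col) of interior pixels strictly greater than all 8 neighbours."""
    maxima = []
    for r in range(1, len(strength) - 1):
        for c in range(1, len(strength[0]) - 1):
            centre = strength[r][c]
            neighbours = [strength[r + dr][c + dc]
                          for dr in (-1, 0, 1) for dc in (-1, 0, 1)
                          if (dr, dc) != (0, 0)]
            if all(centre > n for n in neighbours):
                maxima.append((r, c))
    return maxima
```

Because each test needs only three adjacent rows of
strengths, rows which are no longer required can be
overwritten as processing moves through the image.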

When the local maxima for a region of an image have been
determined, the detection module 72 then checks (S26)
whether the region of the image for which corner
strengths are currently being calculated corresponds to
the last region of an image for which local corner
strength maxima are determined. If the region of an
image for which corner strengths are currently being
determined is not the last region of an image for
determining corner strength, the detection module 72 then
updates the areas of memory 78 storing data relating to
the normalised corner strengths for those pixels which are
no longer necessary for determining the value for local
maxima in the subsequent regions of the image to indicate
that they may be reused and then calculates further
normalised corner strengths (S28, S14-S20) in the next
region of the image and then determines and stores local
maxima of corner strength for that region (S24).

The determination of local maxima region by region
therefore enables data which is no longer necessary to
determine local maxima to be overwritten and hence
minimises the memory required for the determination of
which pixels correspond to local maxima and hence are
most representative of corners in the original image.

If the detection module 72 determines (S26) that the
pixels corresponding to local maxima of corner strength
have been determined for all the pixels in the image the
detection module 72 then (S30) determines whether the
scale used for calculating smoothed images and averaged
second moment matrices corresponds to the final scale
where $\sigma_s = 4$. If the scale does not correspond to
the final scale the detection module 72 then selects (S32)
the next largest scale for use to calculate a new
smoothed image and a further set of local maxima of
normalised corner strengths (S14-S30).

In this embodiment of the present invention the scales
used for setting the values of $\sigma_s$ correspond to a
set of scales where the value of $\sigma_s$ for each scale
is geometrically greater than the previous scale at a
ratio of $\sqrt{2}$, with $\sigma_s$ ranging between 0.5
and 4, i.e. $\sigma_s$ = 0.5, 0.707, 1, 1.414, 2, 2.828
and 4. The detection of features at a number of widely
spaced scales ensures that as far as possible different
feature points are detected at each scale. In this
embodiment scales greater than 4 are not used as the
processing required for generating smoothed images and
averaged second moment matrices at such larger scales is
relatively high and the smoothing at such large scales
results in a loss of locality of feature points detected
using such large scales.
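
The stored set of scales is a geometric series, which the
following short sketch reproduces. The function name and
the rounding to three decimal places (used only to match
the values as listed) are assumptions of the illustration.

```python
import math

def scale_factors(first=0.5, count=7, ratio=math.sqrt(2.0)):
    """Geometric series of scales, rounded to three decimals for display."""
    return [round(first * ratio ** k, 3) for k in range(count)]
```

Calling `scale_factors()` with the defaults gives the
seven values from 0.5 to 4 used in this embodiment.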

When corner strengths and the co-ordinates of local
maxima of corner strengths have been calculated at all of
the selected scales, the detection module 72 then (S34)
filters the data corresponding to the local maxima
detected on the basis of the normalised corner strengths
for those pixels to select a required number of points
which have the highest corner strength and hence are most
strongly indicative of corners within the images. In
this embodiment, which is arranged to process images of
768 by 576 pixels, the top 400 points indicative of the
highest corner strengths determined at any of the seven
scales, with $\sigma_s$ ranging between 0.5 and 4, are
selected.
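
The filtering step (S34) amounts to keeping the strongest
candidates pooled across all scales, which is possible
because the normalised strengths are comparable between
scales. A minimal sketch follows, in which the candidate
layout as a (strength, x, y, scale) tuple and the function
name are hypothetical.

```python
import heapq

def strongest_points(candidates, count=400):
    """Keep the `count` candidates with the highest normalised corner strength.

    Each candidate is a hypothetical (strength, x, y, scale) tuple.
    """
    return heapq.nlargest(count, candidates, key=lambda c: c[0])
```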

When a desired number of feature points most strongly
indicative of corners have been determined by the
detection module 72 the feature detection and
characterizing module 2 will have stored in RAM 78 a set
of coordinates for the feature points, each having an
associated scale at which the feature point has been
detected and the averaged second moment matrix for a
region of the smoothed image centred on the feature
point. In this embodiment, the control module 70 then
invokes the characterization module 74 to generate a set
of data characterizing the feature point in a way which is
not significantly affected by viewing objects from
different viewpoints as will now be described.

FEATURE CHARACTERIZATION 

In order to characterize feature points in a way not
significantly affected by distortions arising from
viewing objects from different view points, the
characterization module 74 in this embodiment
characterizes each of the feature points on the basis of
processed image data for a region centred on that feature
point. The size of the region is selected utilizing
information indicative of the size of the feature which
has been used to select the feature point, and the region
is then converted into an image of a fixed size. This has
the effect of making the characterization substantially
independent of the distance at which an image of an
object is recorded.

The resized image data is then processed to remove
distortions arising from stretch and skew which result
from viewing planar surfaces or surfaces which are
approximately planar from different view points. The
characterization module in this embodiment then generates
a characterization vector utilising the processed image
data, comprising a set of values which are substantially
independent of rotation of the processed image data which
could arise either from rotations within the initial
image data or from the processing to remove the effects
of stretch and skew.

Figure 8 is a flow diagram of the processing of the
characterization module 74 to characterize a feature
point selected by the feature detection module 72. The
processing of Figure 8 is carried out for each of the
feature points detected by the feature detection module
72 so that all of the feature points are characterized in
a way substantially independent of distortions resulting
from viewing objects from different view points.

As an initial step (S40) for characterizing a feature
point, the characterization module 74 selects a portion
of an image, centred on the feature point, to be used as
an image patch to characterize that feature point. In
this embodiment of the present invention, the
characterization module 74 determines the size of this
image patch used to characterize a feature point on the
basis of the scale at which a feature point was detected
by the detection module 72. In the present embodiment,
the characterization module 74 is arranged to utilize an
image patch for the characterization of a feature point
centred on the feature point that is twice the size of
the region of an image used to detect the presence of a
feature point. In this way a feature point is
characterized by an image patch which necessarily
includes the entirety of the feature detected by the
feature detection module 72. By characterizing a feature
point using an image patch centred on the feature point
which is larger than the region of an image used to
detect a feature, the inclusion of some additional image
data is ensured which allows for the image to be
transformed to account for stretch and skew as will be
described in detail later.

After the characterization module 74 has selected the
size of an image patch centred on a feature point, on the
basis of the scale associated with the feature which has
been detected, the characterization module 74 then
re-samples (S42) this image patch of the image to obtain
a new image patch of fixed size. In this embodiment the
size of the new image patch is set at 128 x 128 pixels.
This resizing of the image patch is achieved by linear
interpolation of values for pixels in the new image patch
based upon the values of pixels in the original image
patch. When a re-sampled image patch has been calculated
this is stored in RAM 78.
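
Re-sampling of an image patch to a fixed size by linear
interpolation can be sketched as follows. The function
names, the mapping of the corner pixels of the new patch
onto the corner pixels of the original patch and the
clamping at the border are assumptions of this sketch
rather than details given for the embodiment.

```python
def bilinear(image, x, y):
    """Linearly interpolate image (a list of rows) at real-valued (x, y)."""
    rows, cols = len(image), len(image[0])
    x = min(max(x, 0.0), cols - 1.0)
    y = min(max(y, 0.0), rows - 1.0)
    x0, y0 = int(x), int(y)
    x1, y1 = min(x0 + 1, cols - 1), min(y0 + 1, rows - 1)
    fx, fy = x - x0, y - y0
    top = image[y0][x0] * (1 - fx) + image[y0][x1] * fx
    bottom = image[y1][x0] * (1 - fx) + image[y1][x1] * fx
    return top * (1 - fy) + bottom * fy

def resample(patch, size=128):
    """Resize a patch to size x size pixels (size must be at least 2)."""
    rows, cols = len(patch), len(patch[0])
    return [[bilinear(patch,
                      c * (cols - 1) / (size - 1),
                      r * (rows - 1) / (size - 1))
             for c in range(size)]
            for r in range(size)]
```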

The feature characterization module 74 then calculates a
transformation required to transform the resized image
patch into an image patch in which the effects of stretch
and skew have been removed. The second moment matrix for
an image patch comprises values which are indicative of
the rate of change of grey scale values in the image
patch along the x and y coordinates. The second moment
matrix for an image patch is therefore indicative of how
an image patch appears to be stretched and skewed, and
can therefore be utilized to determine a transformation
to remove the distortions resulting from stretch and skew
which transform squares into parallelograms and circles
into ellipses as will now be described.

Firstly, the characterization module 74 calculates (S44)
a value for the square root of an averaged second moment
matrix for the current image patch. In this embodiment,
since a value for the averaged second moment matrix for
a feature point is calculated and stored as part of the
detection of feature points by the detection module 72,
for an initial iteration this stored value for the
averaged second moment matrix for a feature point on
which the image is centred is utilised as the value for
a calculated second moment matrix for an image patch
centred on that feature point. For subsequent iterations
an averaged second moment matrix for an image patch is
calculated in the same way as has been described in
relation to the calculation of second moment matrices by
the detection module 72.

When either a stored value for an averaged second moment
matrix has been retrieved from memory, or a value for the
averaged second moment matrix for an image patch has been
calculated directly from the image data for an image
patch, the square root of this averaged second moment
matrix is then determined by calculating a Cholesky
decomposition of the averaged second moment matrix. The
Cholesky decomposition is the decomposition of the
averaged second moment matrix M so that:

$$M = \begin{pmatrix} a & 0 \\ b & c \end{pmatrix}\begin{pmatrix} a & b \\ 0 & c \end{pmatrix}$$

where $a = I_x$, and b and c are values determined by the
Cholesky decomposition of the averaged second moment
matrix.

The characterization module 74 then determines (S46) if
this calculated square root is equal to the identity
matrix. If the square root of the second moment matrix
for an image is equal to the identity matrix the image
patch is already indicative of an image which has had the
effect of stretch and skew removed and hence no further
transformation is required. The characterization module
then proceeds to characterize such an image by
calculating a set of rotational invariants (S54) as will
be described later.

If the square root of the second moment matrix is not
equal to the identity matrix, the characterization module
74 instead proceeds to calculate a transformed image
corresponding to the image patch transformed by the
square root of the second moment matrix for the image
patch scaled by a scaling factor $\lambda$ where

$$\lambda = 1/(\det M)^{1/4}$$
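
A Cholesky decomposition of a symmetric 2 x 2 matrix and a
determinant-based scaling factor can be sketched as
follows. The function names are hypothetical, and the
sketch assumes that the scaling factor is the reciprocal
of the fourth root of the determinant, which is the value
for which the scaled square root has unit determinant and
therefore preserves area.

```python
import math

def cholesky_2x2(mxx, mxy, myy):
    """Return (a, b, c) with M = [[a, 0], [b, c]] times [[a, b], [0, c]].

    M = [[mxx, mxy], [mxy, myy]] must be positive definite.
    """
    a = math.sqrt(mxx)
    b = mxy / a
    c = math.sqrt(myy - b * b)
    return a, b, c

def unit_determinant_scale(mxx, mxy, myy):
    """Scaling factor 1 / det(M)**(1/4) (assumed form, see the text above)."""
    return (mxx * myy - mxy * mxy) ** -0.25
```

For example, the matrix with entries 4, 2, 2 and 5 gives
a = 2, b = 1 and c = 2, its determinant is 16, and the
scaling factor is therefore 0.5.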

In this embodiment this transformed image patch is then
generated (S48) by the characterization module 74
determining the co-ordinates of points corresponding to
the origin of pixels in a transformed image and then
calculating (S50) pixel values on the basis of linear
interpolation of a pixel value for these points utilising
the distances and pixel values for the closest adjacent
pixels in an original image, in a conventional manner.


Thus for example where, by applying the inverse of the
square root of the averaged second moment matrix scaled
by $1/(\det M)^{1/4}$ to a point corresponding to a pixel
at position $(x_1, y_1)$, the origin for that point is
determined to be $(x_2, y_2)$, a value for the pixel at
$(x_1, y_1)$ in the transformed image is calculated by
using the pixel values corresponding to the pixels which
are closest to the point $(x_2, y_2)$ in the original
image to interpolate a calculated value for that point. A
transformed image is then built up by calculating pixel
values for each of the other points corresponding to
pixels in the transformed image by determining the origin
for those pixels in an original image by applying the
inverse square root scaled by $1/(\det M)^{1/4}$ and then
calculating pixel values by interpolating a value for a
pixel in the new image from the values for pixels adjacent
to the origin for that pixel using linear interpolation.
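
The building up of a transformed image by determining the
origin of each pixel and interpolating can be sketched as
below. The sketch takes the 2 x 2 matrix that maps a
position in the transformed image to its origin as an
argument named `to_origin`; that name, the measurement of
positions from the centre of the patch and the clamping of
origins lying outside the patch are assumptions of the
illustration.

```python
def bilinear(image, x, y):
    """Linear interpolation from the closest pixels, clamped at the border."""
    rows, cols = len(image), len(image[0])
    x = min(max(x, 0.0), cols - 1.0)
    y = min(max(y, 0.0), rows - 1.0)
    x0, y0 = int(x), int(y)
    x1, y1 = min(x0 + 1, cols - 1), min(y0 + 1, rows - 1)
    fx, fy = x - x0, y - y0
    top = image[y0][x0] * (1 - fx) + image[y0][x1] * fx
    bottom = image[y1][x0] * (1 - fx) + image[y1][x1] * fx
    return top * (1 - fy) + bottom * fy

def warp_patch(patch, to_origin):
    """Build the transformed patch by looking up the origin of each pixel.

    to_origin is a 2 x 2 matrix ((p, q), (r, s)) mapping a position
    (x1, y1), measured from the patch centre, to its origin (x2, y2).
    """
    size = len(patch)
    centre = (size - 1) / 2.0
    (p, q), (r, s) = to_origin
    out = []
    for row in range(size):
        line = []
        for col in range(size):
            u, v = col - centre, row - centre
            x2 = p * u + q * v + centre
            y2 = r * u + s * v + centre
            line.append(bilinear(patch, x2, y2))
        out.append(line)
    return out
```

Working backwards from each pixel of the transformed image
guarantees that every output pixel receives exactly one
value, which would not be the case if pixels of the
original image were pushed forwards.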

The characterization module 74 then determines (S52)
whether a required number of iterations have been
performed. In this embodiment the maximum number of
iterations is set to be equal to two. If the number of
iterations performed is not equal to the maximum number
of iterations which are to be performed the
characterization module 74 then proceeds to calculate the
square root of the averaged second moment matrix for the
transformed image patch and then generates a new
transformed image utilizing this square root of the
averaged second moment matrix for the image patch
(S44-S52).

If the characterization module 74 has performed the
maximum number of iterations required or it has been
established that after calculating a second moment matrix
for an image patch that second moment matrix is equal to
identity, the transformed image patch will then
correspond either exactly or approximately to an image
patch from which the effects of stretch and skew have
been removed. The characterization module then proceeds
to calculate a set of rotational invariants (S54) to
characterize the transformed image in a manner which is
substantially independent of rotation of the transformed
image as will be described in detail below.

As stated above the second moment matrix for an image
patch is indicative of the rate of change of grey scale
value across an image patch. Where one image patch
corresponds to another image patch which has been
stretched and skewed by an affine transformation, if both
of these image patches are transformed by the above
described process so that the second moment matrix for
both of the image patches is equal to identity, the
transformed image patches will correspond to each other
subject to an arbitrary rotation provided the second
moment matrix is calculated for what amounts to identical
portions of an image. This correspondence arises as is
explained in "Shape-adapted Smoothing in Estimation of
3-D Shape Cues from Affine Deformations of Local 2-D
Brightness Structure", Image and Vision Computing, 15
(1997) pp422-423 because of the relationship for a second
moment matrix that:

$$M(BJ) = B^T M'(J)\, B$$

where B is a transformation resulting in stretch and skew
of an image patch, M(J) is an averaged second moment
matrix for an image patch J, and M'(J) is the second
moment matrix for an image patch J for a region of an
image J which corresponds to the image patch BJ.

It then follows that if for two images J and J' which
correspond to the same part of an image, M(J) = M(J') = I
and J' = BJ, then

$$\begin{aligned} I &= M(J') \\ &= M(BJ) \\ &= B^T M'(J)\, B \\ &= B^T I B \\ &= B^T B \end{aligned}$$

which implies B is a rotation and hence J and J' are the
same image subject to an arbitrary rotation B, provided J
and J' correspond to the same portions of an image (i.e.
J' = BJ).

In the present embodiment, the characterization module 74
is arranged to transform an image patch by a number of
transformations equal to the square root of an averaged
second moment matrix scaled by a scaling factor equal to
$1/(\det M)^{1/4}$. These transformations have the effect
of transforming the original 128 x 128 image patches used
to characterize a feature point to correspond to a
distorted image patch in the original image. This amounts
to an approximation which is equivalent to varying the
shape of the region used for selecting an image patch so
that the image patches used to characterize feature points
of an object appearing in images taken from different view
points correspond to the same patches of the objects
appearing in each of the images. Therefore if the second
moment matrix for such transformed image patches is equal
to the identity matrix, the above relationship that
transformed images will correspond subject to an
arbitrary rotation will hold. It has been found that
good matching results occur when only one or two
iterations are used to transform an image patch and hence
in this embodiment the total number of iterations is
limited to two.

In this embodiment of the present invention after a 
transformed image patch for a feature point has been 
transformed to account for changes in scale, stretch and 
skew, this transformed image patch is then used to 
generate a characterization vector characterizing the 
feature point in a way substantially unaffected by 
distortions arising from changes of the appearance of an 
object by being viewed from different view points. This 
is achieved by generating a characterisation vector 
utilising calculated rotational invariants for the image 
patch as the combined result of processing a portion of 
an image to account for changes in scale, stretch, skew
and rotation is to characterise a point in a way 
substantially unaffected by distortions arising from 
changes of camera view point. 

To achieve this the characterisation module 74 in this
embodiment is arranged to generate a characterization
vector utilizing values determined using a set of masks
to calculate a set of complex coefficients comprising
approximate determinations of

$$U_{n,m} = \int\!\!\int F_n(r)\, e^{im\varphi}\, J(r,\varphi)\, r\, dr\, d\varphi$$

where $J(r,\varphi)$ is the transformed image centred on a
feature point, $F_n(r)$ is a set of circular symmetric
functions and $0 \le n \le n_{max}$, $0 \le m \le m_{max}$.
Specifically, in this embodiment, the characterisation
module is arranged to calculate a set of nine complex
coefficients comprising the values for $U_{n,m}$ for an
image where $n_{max}$ and $m_{max}$ are equal to 2.

Under a rotation of an image:

$$J'(r,\varphi) = J(r,\varphi + \theta)$$

these complex coefficients undergo the following
transformation:

$$U'_{n,m} = e^{-im\theta}\, U_{n,m}$$

By calculating the above set of complex coefficients a
set of values unaffected by rotation of the image may
therefore be determined since

1) $\mathrm{Re}(U_{n,0}) = \mathrm{Re}(e^{0}\, U_{n,0}) = \mathrm{Re}(U'_{n,0})$ for $0 \le n \le n_{max}$, for all $\theta$,

where Re(z) is the real part of complex variable z;

2) $|U_{0,m}| = |e^{-im\theta}\, U_{0,m}| = |U'_{0,m}|$ for $1 \le m \le m_{max}$, for all $\theta$; and

3) $U_{n,m} U^*_{0,m} / |U_{0,m}| = e^{-im\theta} U_{n,m}\, e^{im\theta} U^*_{0,m} / |U_{0,m}| = U'_{n,m} U'^*_{0,m} / |U'_{0,m}|$ for $1 \le m \le m_{max}$, $1 \le n \le n_{max}$, for all $\theta$,

where U* is the complex conjugate of the complex variable
U.

Therefore the following values can be determined
utilizing these complex variables which are unaffected by
rotation of an image $J(r,\varphi)$.

1. $\mathrm{Re}(U_{n,0})$ for $0 \le n \le n_{max}$

2. $|U_{0,m}|$ for $1 \le m \le m_{max}$

3. $\mathrm{Re}(U_{n,m} U^*_{0,m} / |U_{0,m}|)$ for $1 \le n \le n_{max}$, $1 \le m \le m_{max}$

4. $\mathrm{Im}(U_{n,m} U^*_{0,m} / |U_{0,m}|)$ for $1 \le n \le n_{max}$, $1 \le m \le m_{max}$

where

Re(z) is the real part of complex variable z
Im(z) is the imaginary part of complex variable z
and U* is the complex conjugate of the complex variable
U.
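
The four groups of rotationally invariant values listed
above can be assembled from a set of complex coefficients
as in the following sketch. The dictionary keyed by (n, m)
and the function names are assumptions of the
illustration, and the sketch assumes that no coefficient
with n equal to zero vanishes, since it is used as a
divisor. The second function applies a phase change to
each coefficient in proportion to m, which is the effect a
rotation of the image has on the coefficients, so that the
invariance can be checked numerically.

```python
import cmath

def rotational_invariants(u, n_max=2, m_max=2):
    """u maps (n, m) to the complex coefficient U_{n,m}; every U_{0,m} must be non-zero."""
    values = [u[(n, 0)].real for n in range(n_max + 1)]
    values += [abs(u[(0, m)]) for m in range(1, m_max + 1)]
    for m in range(1, m_max + 1):
        for n in range(1, n_max + 1):
            z = u[(n, m)] * u[(0, m)].conjugate() / abs(u[(0, m)])
            values += [z.real, z.imag]
    return values

def rotate_coefficients(u, theta):
    """Multiply each U_{n,m} by exp(-i m theta), as a rotation by theta would."""
    return {(n, m): cmath.exp(-1j * m * theta) * c for (n, m), c in u.items()}
```

With n and m each ranging from 0 to 2 the vector contains
thirteen values: three real parts, two moduli and eight
values from the four remaining coefficient pairs.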

The calculation of approximations of:

$$U_{n,m} = \int\!\!\int F_n(r)\, e^{im\varphi}\, J(r,\varphi)\, r\, dr\, d\varphi$$

where $J(r,\varphi)$ is a transformed image centred on a
feature point and $F_n(r)$ is a set of circular symmetric
functions with $0 \le n \le 2$ and $0 \le m \le 2$ in this
embodiment, is performed by summing scaled pixel values
for a transformed image patch, with each of the pixels in
the transformed image scaled by a scaling mask for each
pair of n and m comprising a table of scaling factors. In
this embodiment, a total of eighteen scaling masks are
stored in memory and then used to calculate the
approximations of the real and imaginary portions of
$U_{n,m}$ with $0 \le n \le 2$ and $0 \le m \le 2$. Each of
these masks comprises a stored 128 x 128 table of scaling
factors where the scaling factors in each of the real
masks correspond to calculated values for

$$F_n(r)\cos(m\varphi)$$

where r and $\varphi$ correspond to polar coordinates for
a pixel at position x,y relative to the centre of an image
patch and the scaling factors for each of the imaginary
masks correspond to calculated values for

$$F_n(r)\sin(m\varphi)$$

where r and $\varphi$ correspond to polar coordinates for
a pixel at a position x,y relative to the centre of an
image patch.

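Thus in this way an approximation of $U_{n,m}$ for each of the
values of n and m, with $0 \le n \le 2$ and $0 \le m \le 2$, can
be determined for a 128 x 128 transformed image since

$$U_{n,m} \approx \sum_{x=0}^{128} \sum_{y=0}^{128} F_n(r)\, e^{im\varphi}\, p(x,y)$$

where p(x,y) is the grey scale value of a pixel in a
transformed 128 x 128 image patch at position x,y, and r
and $\varphi$ are the polar coordinates of that pixel relative
to the centre of the image patch.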
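
The scaled sum described above may be sketched as follows. This is an illustrative sketch only: the function names are hypothetical, and a 3 x 3 patch with made-up scaling factors stands in for the 128 x 128 tables so that the arithmetic can be followed by hand.

```python
def scaled_sum(mask, patch):
    """Sum of pixel values, each scaled by the mask entry at the same position."""
    return sum(
        factor * pixel
        for mask_row, patch_row in zip(mask, patch)
        for factor, pixel in zip(mask_row, patch_row)
    )


def approximate_coefficient(real_mask, imag_mask, patch):
    """Approximate one complex coefficient from its real and imaginary masks."""
    return complex(scaled_sum(real_mask, patch), scaled_sum(imag_mask, patch))


# Hypothetical 3 x 3 stand-ins for a grey scale patch and a pair of masks.
patch = [[10, 20, 30],
         [40, 50, 60],
         [70, 80, 90]]
real_mask = [[0.0, 0.5, 0.0],
             [0.5, 1.0, 0.5],
             [0.0, 0.5, 0.0]]
imag_mask = [[-1, 0, 1],
             [-1, 0, 1],
             [-1, 0, 1]]

u = approximate_coefficient(real_mask, imag_mask, patch)
print(u)
```

Because the masks are computed once and stored, each coefficient costs only one multiplication and one addition per pixel at run time.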

The generation of a characterization vector for a feature
point by the characterization module 74, utilizing stored
masks for calculating an approximation of $U_{n,m}$ with
$0 \le n \le 2$ and $0 \le m \le 2$, will now be described with
reference to Figures 9A, 9B and 9C, which comprise a flow
diagram for the calculation of characterization vectors
utilizing a stored set of scaling masks and correspond to
step S54 in Figure 8, and also Figures 10-13, which are
illustrations showing the distribution of scaling factors
for scaling masks.

Initially (S60) n and m are set to zero. The
characterization module 74 then selects (S62) from the
stored set of 128 x 128 masks a real mask for calculating
the real value for $U_{n,m}$.

In this embodiment $F_n(r)$ is selected to be the set of
n-th derivatives of a Gaussian function with a standard
deviation $\sigma$ proportional to the size of the 128 x 128
transformed image patch, with $0 \le n \le 2$. By utilising a
function which decreases the further away a pixel is from
the centre of an image patch, calculated values for $U_{n,m}$
are most strongly dependent upon pixel values for the centre
of an image patch and hence the characterization of a
feature point is primarily dependent upon the portion of
an image closest to the feature point.
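
The radial functions can be sketched as follows, assuming an unnormalised Gaussian exp(-r^2 / (2 sigma^2)) whose value at the centre is one. The function name and the value sigma = 32 are assumptions made for illustration; the text states only that sigma is proportional to the size of the patch.

```python
import math


def gaussian_derivative(n, r, sigma):
    """n-th derivative (n = 0, 1, 2) with respect to r of exp(-r^2 / (2 sigma^2))."""
    g = math.exp(-r * r / (2.0 * sigma * sigma))
    if n == 0:
        return g
    if n == 1:
        return -r / sigma ** 2 * g
    if n == 2:
        return (r * r / sigma ** 2 - 1.0) / sigma ** 2 * g
    raise ValueError("only n = 0, 1, 2 are used in this sketch")


sigma = 32.0  # hypothetical choice for a 128 x 128 patch
for r in (0.0, 16.0, 32.0, 64.0):
    print(r, [gaussian_derivative(n, r, sigma) for n in range(3)])
```

The second derivative is negative at the centre, crosses zero at r = sigma and is positive beyond it before decaying, which corresponds to a mask with a negative centre surrounded by a positive annulus.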

Figure 10 is an illustration of the distribution of
scaling factors in an example of a mask for calculating
Re($U_{0,0}$), where the scaling factor for points in the image
is represented by the shade of grey in the figure.
Thus Figure 10 illustrates a mask for
calculating an approximation of the real value of:

$$U_{0,0} = \iint G_\sigma(r)\, J(r,\varphi)\, dr\, d\varphi$$

where $G_\sigma(r)$ is a Gaussian function with a standard
deviation $\sigma$ proportional to the size of a transformed
image of 128 by 128 pixels.

In the case of $U_{0,0}$, since this is a completely real
variable, the calculation of the real portion of the
variable is the same as the calculation of the value for
$U_{0,0}$ itself. The mask for calculating the value of this
coefficient therefore comprises a table of scaling
factors, where the factors are arranged in a series of
concentric circles and the scaled contribution of the
image decreases exponentially from one in the centre of
the image patch to zero towards the edge of the image
patch in accordance with the distance of a pixel of the
image patch from the centre of the image patch. Thus, as
illustrated in Figure 10, the small white circle at the
centre of the mask corresponds to a positive scaling
factor of one and the mid-grey at the edge of the mask
corresponds to a scaling factor of zero.

After the mask for $U_{0,0}$ has been selected a value for
Re($U_{0,0}$) is calculated using the mask (S64) by summing the
grey scale values of the transformed image patch, that is
the image patch which has been transformed to remove the
effect of stretch and skew, where the contribution of each
pixel is scaled by a factor in accordance with the
selected mask. In the case of $U_{0,0}$ this has the effect of
calculating a characterization value for the image patch
in a similar way to the calculation of the values for
pixels in the smoothed image, as the characterization
value for an image patch is equal to the sum of the grey
scale values for each of the pixels in the image patch
where the contribution of each pixel is scaled by a
scaling factor which decreases exponentially, from one
towards zero, with the distance from the centre of the
image. The characterization module
74 then causes the calculated value to be stored in
memory 78.

The characterization module 74 then selects (S66) an
imaginary mask for calculating the imaginary portion of
the complex variable under consideration. For complex
variables other than $U_{n,0}$ a value for the imaginary portion
of $U_{n,m}$ is calculated utilizing the selected mask and then
stored (S68).

In the case of $U_{n,0}$, since $U_{n,0}$ is an entirely real
variable, the mask for Im($U_{n,0}$) would scale all of the values
for the image patch by zero. Thus in the case of Im($U_{n,0}$)
the step of selecting an imaginary mask and calculating
an approximation of the imaginary portion of $U_{n,0}$ is
omitted, with the value zero merely being stored
automatically.

The characterization module 74 then determines (S70) whether
the current value of n is equal to the maximum value of
n (in this embodiment 2) for which the complex variables
$U_{n,m}$ are to be calculated.

If the characterization module 74 determines that the
current value of n is less than the maximum value of n
for which the complex variables $U_{n,m}$ are to be calculated,
the characterization module 74 then increments (S71) the
value of n and utilizes the new value of n to select
(S62) a different mask for calculating the estimate of
the real portion of another complex variable, which is
calculated and stored (S64). The
characterization module 74 then selects (S66) another
mask for the calculation of the imaginary portion of $U_{n,m}$,
which is calculated and stored (S68). When the
imaginary portion of $U_{n,m}$ has been stored the
characterization module 74 then again determines (S70) whether
the current value of n is equal to the maximum value of
n.

When the characterization module 74 determines that the
final value for n has been reached the characterization
module then determines (S72) whether the current value of
m is equal to the maximum value of m for which real and
imaginary portions of $U_{n,m}$ are to be calculated. In this
embodiment the characterization module 74 checks whether
m is equal to 2 as this is the greatest value of m for
which $U_{n,m}$ is calculated. If the value for m is not equal
to the maximum value of m the characterization module 74
then increments the value of m and sets the value of n to
zero to calculate a further set of complex variables for
each value of n from zero to $n_{max}$ (S62-S74).
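
The iteration over n and m described above may be sketched as follows. This is a sketch under stated assumptions rather than the implementation itself: the dictionary layout keyed by (n, m, part), the function names and the 2 x 2 constant masks are all hypothetical, and the step labels in the comments follow the flow diagram references given in the text.

```python
def scaled_sum(mask, patch):
    """Sum of pixel values, each scaled by the mask entry at the same position."""
    return sum(f * p for mask_row, patch_row in zip(mask, patch)
               for f, p in zip(mask_row, patch_row))


def calculate_coefficients(masks, patch, n_max=2, m_max=2):
    """Return {(n, m): complex}; masks maps (n, m, "re" or "im") to a table."""
    coefficients = {}
    for m in range(m_max + 1):            # outer loop over m (S72, S74)
        for n in range(n_max + 1):        # inner loop over n (S70, S71)
            real = scaled_sum(masks[(n, m, "re")], patch)      # S62, S64
            if m == 0:
                imag = 0.0                # U[n,0] is entirely real: store zero
            else:
                imag = scaled_sum(masks[(n, m, "im")], patch)  # S66, S68
            coefficients[(n, m)] = complex(real, imag)
    return coefficients


# Hypothetical 2 x 2 patch and constant masks, for illustration only.
patch = [[1, 2], [3, 4]]
masks = {}
for n in range(3):
    for m in range(3):
        masks[(n, m, "re")] = [[n + 1] * 2 for _ in range(2)]
        if m > 0:
            masks[(n, m, "im")] = [[m] * 2 for _ in range(2)]

coefficients = calculate_coefficients(masks, patch)
print(len(coefficients))
```

No imaginary mask is looked up when m is zero, mirroring the omission of the imaginary mask step for the entirely real variables.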

For each of the iterations for the calculation of values
for $U_{n,m}$ a different set of real and imaginary masks, each
comprising 128 by 128 tables of scaling factors, is used
for determining a scaling of the contributions from each
of the pixels in the image patch to determine the
approximate value for $U_{n,m}$. Figures 11, 12A, 12B, 13A, 13B
and 14A and 14B are illustrative examples of the
arrangement of scaling factors within the 128 x 128
tables for scaling the contributions of pixels at a
corresponding position within the 128 x 128 image patch
to calculate the values for $U_{n,m}$ for different values of
n and m.

Figure 11 is an illustrative example of the arrangement
of scaling factors within a 128 x 128 table for the
calculation of $U_{2,0}$. In the case of the calculation of

$$U_{2,0} = \iint \frac{d^2 G_\sigma(r)}{dr^2}\, J(r,\varphi)\, dr\, d\varphi$$

$G_\sigma(r)$ is a Gaussian function with a standard
deviation $\sigma$ proportional to the size of the transformed
image of 128 by 128 pixels.

As is the case for all of the complex variables $U_{n,0}$, this
is an entirely real variable. The imaginary portion of
$U_{n,0}$ is therefore equal to zero. The real portion of
$U_{2,0}$ can be determined by calculating the sum of the grey
scale values for pixels in an image patch scaled by
scaling factors where the scaling factors are arranged as
shown in Figure 11.

In the case of $U_{2,0}$, as is shown in Figure 11, the variation
in scaling factors is illustrated by varying shades of
grey where white corresponds to a positive scaling factor
of 1, black corresponds to a negative scaling factor of
-1 and the mid grey at the edge of the figure corresponds
to a value of zero. In the case of a mask for
calculating the value of Re($U_{2,0}$) the scaling factors vary
between -1 and 1. The scaling mask is such that the
central portion of an image patch is scaled by a
factor of -1, with an annulus further away from the
centre of the image having a scaling factor of 1, and
the scaling factor varying gradually from -1 to 1 as one
moves away from the centre towards this annulus. Beyond
this annulus the scaling factor reduces from 1 to 0
further away from the centre of the image patch.

Figures 12A and 12B are exemplary illustrations of
arrangements of scaling factors within tables for masks
for calculating the real and imaginary portions of $U_{0,1}$
respectively. As in the case of Figures 10 and 11 these
scaling factors are shown proportionately as shades of
grey in the figures, where black indicates a scaling factor
of -1, white indicates a scaling factor of 1 and a mid
grey at the edge of the figure indicates a scaling factor
of zero, with intermediate shades of grey being indicative
of intermediate scaling factors.

In the case of the real portion of $U_{0,1}$, as is shown in
Figure 12A, the scaling mask comprises two regions, one
on the left hand side of the image patch where the
contributions of pixels on that side of the image patch
are scaled by negative scaling factors and a symmetrical
region on the right hand side of the image patch where
the contributions of pixels in that region of the image
patch are scaled by positive scaling factors
proportional to the corresponding negative scaling
factors of pixels in the left hand portion of the image.

Figure 12B is an illustration of the arrangement of scaling
factors within a table for a mask for calculating the
imaginary portion of $U_{0,1}$. The mask of Figure 12B is
identical to the mask of Figure 12A except that the mask
is rotated about the centre of the image patch by 90° so
that a region of the image patch at the top of the patch
is scaled by a variety of negative scaling factors and a
symmetrical region of the image patch at the lower
portion of the image is scaled by positive scaling
factors.

Figures 13A and 13B are illustrative examples of
arrangements of scaling factors within tables for masks
for calculating the real and imaginary portions of $U_{0,2}$.
The masks indicate the scaling factors for different
portions of an image in the same manner as Figures 10,
11, 12A and 12B, with white indicating a positive scaling
factor of 1, black indicating a negative scaling factor
of -1 and intermediate shades of grey indicating
intermediate scaling factors, with the mid grey at the
edge of the figure indicating a scaling factor of zero.

As can be seen from Figure 13A, the mask for the scaling
of contributions of an image to determine the value for
the real part of $U_{0,2}$ comprises a pair of regions aligned
along an axis running from the top left hand corner of an
image patch to the bottom right hand corner of the image
patch, which scale the contributions of pixels in an image
patch by positive factors, and a pair of regions along an
axis from the top right hand corner of the figure to the
bottom left hand corner of the figure, comprising two
regions in which the contributions of pixels in the patch
are scaled by negative scaling factors.

The scaling mask of Figure 13B for determining the
imaginary portion of $U_{0,2}$ comprises a similar arrangement
of similar regions to that of Figure 13A in which the
regions are arranged along axes rotated 45° anticlockwise
relative to the orientations of the same regions in the
mask for calculating the real portion of $U_{0,2}$ shown in
Figure 13A.

When the characterization module 74 has calculated all of
the required values of $U_{n,m}$, data representative of these
values will be stored in memory 78. The characterization
module 74 then proceeds to utilize these values to
generate sequentially a characterization vector
characterizing the sampled image patch as will now be
described.

In order to generate the characterization vector for a
feature point the characterization module 74 initially
sets the value of n to zero (S78). $U_{0,0}$, which is an
entirely real variable, is then stored (S80) in memory 78
as part of the characterization vector for the feature
point for which the values of $U_{n,m}$ have been determined.
The characterization module 74 then determines (S82)
whether n is equal to $n_{max}$, i.e. in this embodiment whether
n = 2. If this is not the case the characterization
module increments n (S84) and stores the value of $U_{n,0}$ for
the new value of n as the next value in the sequentially
generated characterization vector for the feature point
(S80). In this way all of the values of $U_{n,0}$ for
$0 \le n \le n_{max}$ are stored as part of the characterization
vector for a feature point.

When the characterization module 74 determines (S82) that
n = $n_{max}$, the characterization module then sets n and m
equal to 1 (S86). The characterization module 74 then
determines (S88) and stores in the memory 78 the value of
the modulus of $U_{0,m}$ as the next value of the sequentially
generated characterization vector for the feature point
currently being processed, with the modulus of $U_{0,m}$ being
determined from the values for the real and imaginary
portions of $U_{0,m}$ stored in memory 78.

The characterization module 74 then determines (S90) a
value for the complex conjugate of $U_{0,m}$ from the values for
$U_{0,m}$ stored in memory 78 and determines from the value for
the complex conjugate $U^{*}_{0,m}$ the value for

$$U^{*}_{0,m} / |U_{0,m}|$$

where $U^{*}_{0,m}$ is the complex conjugate of $U_{0,m}$ and $|U_{0,m}|$ is
the modulus of $U_{0,m}$.

The characterization module 74 then determines (S92) and
stores the real and imaginary portions of the product of
$U_{n,m}$ and $U^{*}_{0,m}/|U_{0,m}|$, with the real and imaginary portions
of this product being stored as parts of the sequentially
generated characterization vector for the feature point
being processed.

The characterization module 74 then determines (S94)
whether the current value for n is equal to $n_{max}$ (i.e. in
this embodiment whether n = 2). If this is not the case the
characterization module 74 then increments n (S96) and
calculates a further set of values for the real and
imaginary portions of the product $U_{n,m} U^{*}_{0,m}/|U_{0,m}|$
utilizing this new value of n. In this way the products
of $U_{n,m}$ and $U^{*}_{0,m}/|U_{0,m}|$ for all values of n are calculated
and stored as part of the sequentially generated
characterization vector for a feature point.

When the characterization module 74 establishes that n =
$n_{max}$ the characterization module then (S98) tests to
determine whether m is $m_{max}$. In this embodiment this means
the characterization module 74 tests to determine whether
m = 2. If m is not equal to $m_{max}$ the characterization
module 74 increments m (S100) and resets n to 1 and then
proceeds to calculate and store as parts of the
characterization vector for a feature point the modulus of
$U_{0,m}$ utilizing the new m and the products of $U^{*}_{0,m}/|U_{0,m}|$
and $U_{n,m}$ with $1 \le n \le n_{max}$ (S88-S96). In this way the
characterization module generates a characterization
vector utilizing the values for $U_{n,m}$ in a way which
generates values which are substantially independent of
rotation of images in the transformed image patch.






Thus for example in the present embodiment where $n_{max}$ and
$m_{max}$ are both equal to 2 the generated characterization
vector comprises the following thirteen values:

$U_{0,0}$, $U_{1,0}$, $U_{2,0}$,
$|U_{0,1}|$, $\mathrm{Re}(U_{1,1}V_{0,1})$, $\mathrm{Im}(U_{1,1}V_{0,1})$, $\mathrm{Re}(U_{2,1}V_{0,1})$, $\mathrm{Im}(U_{2,1}V_{0,1})$,
$|U_{0,2}|$, $\mathrm{Re}(U_{1,2}V_{0,2})$, $\mathrm{Im}(U_{1,2}V_{0,2})$, $\mathrm{Re}(U_{2,2}V_{0,2})$, $\mathrm{Im}(U_{2,2}V_{0,2})$

where $V_{0,1} = U^{*}_{0,1}/|U_{0,1}|$ and

$V_{0,2} = U^{*}_{0,2}/|U_{0,2}|$,

all of which are substantially independent of rotation of
a transformed image patch.
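
The sequential generation of the characterization vector may be sketched as follows. The function name and the coefficient values are hypothetical, the step labels in the comments follow the references given in the text, and the sketch assumes that the modulus of U[0,m] is non-zero.

```python
def characterization_vector(U, n_max=2, m_max=2):
    """Build the vector of rotation-independent values from U[(n, m)]."""
    vector = [U[(n, 0)].real for n in range(n_max + 1)]   # S78 to S84
    for m in range(1, m_max + 1):                         # S86, S98, S100
        modulus = abs(U[(0, m)])                          # S88
        vector.append(modulus)
        v = U[(0, m)].conjugate() / modulus               # S90 (modulus != 0)
        for n in range(1, n_max + 1):                     # S92 to S96
            product = U[(n, m)] * v
            vector.extend([product.real, product.imag])
    return vector


# Hypothetical coefficients; U[n,0] is entirely real.
U = {(n, m): complex(n + 1, m * (n + 2)) for n in range(3) for m in range(3)}
U[(0, 1)] = complex(3, 4)
U[(1, 1)] = complex(3, 4)

vector = characterization_vector(U)
print(len(vector))
```

With n_max and m_max both equal to 2 the vector has 3 + 2 x (1 + 2 x 2) = 13 entries, in the order listed above.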

As the selection and processing of an image patch for the 
characterization of a feature point generates an image 
patch for a feature point which is substantially 
independent of distortions arising from changes in scale 
and distortions of stretch and skew arising from changes 
of view point, the combined result of selecting an image 
patch, processing the patch and characterizing a 
transformed image patch in a way which is substantially 
independent of rotation, is to generate a 
characterization vector for a feature point which is 
substantially independent of distortions arising from 
changes of camera view point.

MATCHING MODULE

When all the feature points of a pair of images have had
characterization vectors generated for them in the manner
described above the control module 70 then invokes the
matching module 76 to determine which feature points in
one image are most likely to correspond to the feature
points in the second image, utilising these
characterization vectors. As the characterization
vectors for feature points are substantially independent
of distortions arising from changes of camera view point,
the matching of feature points between pairs of images
should result in the matching of points corresponding to
the same physical point on an object in a pair of images
of that object taken from different view points.

Figure 14 is a flow diagram of the processing of the
matching module 76. Initially (S110), in order to remove
systematic correlations between the characterization
vectors for the feature points, a covariance matrix for
the characterization vectors is calculated in a
conventional manner. New characterization vectors are
then calculated for the feature points in the images,
where the new characterization vectors for feature points
are determined from the previously calculated
characterization vectors which are multiplied by the
inverse of the square root of the covariance matrix for
the characterization vectors. All of these new
characterization vectors are then stored in memory 78.






The calculation of the new set of characterization 
vectors has the effect of generating a set of normalised 
characterization vectors, normalised to remove systematic 
correlations between the values of the vector which arise 
because of systematic correlations within the original 
image data. 

The matching module 76 then (S112) determines how closely
normalised characterization vectors for points in one
image correspond to characterization vectors for points
in another image. The correspondence between vectors is
determined by calculating the square of the Euclidean
distance between each of the normalised characterization
vectors for feature points in one image and each of the
normalised characterization vectors for points in the
other image. These squares of Euclidean distances are
indicative of the square of Mahalanobis distances between
the characterization vectors originally calculated by the
characterization module 74 for feature points in the
images, since the Mahalanobis distance between two
vectors $x_i$, $x_j$ is defined by:

$$d(x_i, x_j) = \sqrt{(x_i - x_j)^T\, C^{-1}\, (x_i - x_j)}$$

where C is the covariance matrix for the data.
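
The relationship between the two distances can be illustrated numerically. The sketch below is illustrative only and swaps in a Cholesky factor L of the covariance matrix (C = L L^T) in place of the symmetric square root, because multiplying by the inverse of either factor gives the same squared distances; the function names and the 2 x 2 covariance matrix are hypothetical.

```python
import math


def cholesky(c):
    """Lower-triangular L with C = L L^T, for a symmetric positive definite C."""
    n = len(c)
    l = [[0.0] * n for _ in range(n)]
    for i in range(n):
        for j in range(i + 1):
            s = sum(l[i][k] * l[j][k] for k in range(j))
            if i == j:
                l[i][j] = math.sqrt(c[i][i] - s)
            else:
                l[i][j] = (c[i][j] - s) / l[j][j]
    return l


def whiten(l, x):
    """Solve L y = x by forward substitution, so that y = L^-1 x."""
    y = []
    for i in range(len(x)):
        y.append((x[i] - sum(l[i][k] * y[k] for k in range(i))) / l[i][i])
    return y


def squared_distance(a, b):
    return sum((p - q) ** 2 for p, q in zip(a, b))


covariance = [[4.0, 2.0], [2.0, 3.0]]   # hypothetical covariance matrix
factor = cholesky(covariance)
a, b = [1.0, 2.0], [3.0, 1.0]

# Squared Euclidean distance between the normalised vectors.
d2 = squared_distance(whiten(factor, a), whiten(factor, b))
print(d2)
```

For this example the inverse of the covariance matrix is [[0.375, -0.25], [-0.25, 0.5]], so the squared Mahalanobis distance for the difference (-2, 1) evaluated directly is 1.5 + 1.0 + 0.5 = 3, which agrees with the value of d2 up to rounding.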



The matching module 76 then determines (S114) for each of
the normalised characterization vectors of feature points

The ambiguity score calculated by determining a ratio
between the most closely corresponding and second most
closely corresponding normalised characterization vectors
for points in the second image is indicative of the
ambiguity of the best match for the point in the first
image to a point in the second image. Where the
ambiguity score is significantly less than one this
indicates that the best candidate match for a point in
the second image for matching to a point in the first
image is characterized in a way in which it is clearly
closer to the characterization of the feature point in
the first image than any other point in the second image.
Where the ambiguity score is close to one this indicates
that there are alternative matches for a feature point in
the first image whose characterization vectors are almost
as good a match as the feature point which most
closely matches the characterization vector of the
feature point in the first image.

By selecting the matches for pairs of images on the basis
of selecting the least ambiguous matches, the points which
are matched are least likely to be incorrectly matched.
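
The selection of the least ambiguous matches may be sketched as follows. This sketch assumes that the ambiguity score is the ratio of the squared distance to the best candidate to the squared distance to the second best candidate, and that at least two candidates exist; the function names and the example vectors are hypothetical.

```python
def squared_distance(a, b):
    return sum((p - q) ** 2 for p, q in zip(a, b))


def best_match(vector, candidates):
    """Return (index of the closest candidate, ambiguity score)."""
    ranked = sorted((squared_distance(vector, c), i)
                    for i, c in enumerate(candidates))
    (best, best_index), (second, _) = ranked[0], ranked[1]
    # A score near zero is unambiguous; a score near one is ambiguous.
    ambiguity = best / second if second > 0 else 1.0
    return best_index, ambiguity


def least_ambiguous_matches(first, second, count):
    """Return up to `count` tuples (i, j, ambiguity) with the lowest ambiguity."""
    matches = []
    for i, vector in enumerate(first):
        j, ambiguity = best_match(vector, second)
        matches.append((ambiguity, i, j))
    matches.sort()
    return [(i, j, ambiguity) for ambiguity, i, j in matches[:count]]


# Hypothetical normalised characterization vectors for two images.
first = [[0, 0], [5, 5]]
second = [[0.1, 0], [4, 4], [6, 6], [10, 0]]

selected = least_ambiguous_matches(first, second, count=1)
print(selected)
```

In this example the second point of the first image lies equally close to two candidates, so its ambiguity score is one and it is not selected, however small its best distance may be.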

Thus for example in Figure 2A portions of images about
points in the first image 20 corresponding to windows
24, 26, 28, 30 are very similar and hence characterization
vectors generated for these points would also be very
similar. After a transformation resulting from a change

error is substantial. Thus where a large number of
equally likely candidates for a matching exist it is
preferable to ignore that potential match, regardless of
how strong it might be.

Thus in this embodiment, when ambiguity scores have been
determined for the potential matches for each of the
points in the first image the matching module 76 then
selects (S118) from the list of matches the matches which
have the lowest ambiguity scores. Selecting the matches
having the lowest ambiguity scores ensures that matches
which are selected are most likely to correspond to
unique portions of images and hence are most likely to
correspond to the same point on an object in images of an
object taken from different view points. The matching
module 76 then outputs (S120) a list comprising pairs of
coordinates for the points in the first image having the
lowest ambiguity scores and the corresponding points in
the second image whose characterization vectors most
closely correspond to those points. This list of
coordinates identifies those points in the images which
correspond to the same physical points on an object
appearing in those images. This list of matched feature
points is then output to the output buffer 62 and is then
made available, for example by being sent to the camera
position calculation module 6 in the form of an
electrical signal or by being output on a disc for
further processing by the camera position calculation

in an image then proceeds utilizing this grey scale image
in the manner previously described. Thus in this way the
points within the colour image corresponding to corners
are determined.

The characterization module 74 in this embodiment is
arranged to select and transform image patches of the
colour image associated with feature points in the same
way as is described in relation to the first embodiment
to establish transformed colour images associated with
feature points which are transformed to account for the
effect of stretch and skew.

However, in contrast to this previous embodiment, the
characterization module 74 is then arranged to determine
a set of complex coefficients utilizing scaling masks as
has previously been described to obtain scaled sums of
each of the individual red, green and blue components of
the pixels for the transformed image patches. This is
achieved in the same manner as has been described in
relation to the calculation of complex coefficients for
a grey scale image, with each of the red, green and blue
channels being treated as a separate grey scale image.
The characterization module 74 then calculates the
following values for an image patch which are independent
of the rotation of image data for that image patch:
$\mathrm{Re}(U^{R}_{n,0})$ for $0 \le n \le n_{max}$
$\mathrm{Re}(U^{G}_{n,0})$ for $0 \le n \le n_{max}$

minimised. These errors arise because the values for the
complex coefficients are calculated by approximation of
integrations by calculations of scaled sums. Since only
the arguments of some complex variables are used to
account for variations arising due to rotation, the most
reliable complex variable to use will have the largest
modulus, as the argument for this complex coefficient
will be least affected by small variations in the
calculated values for its real and imaginary parts
arising due to approximations.

When all of these values for the characterization of an
image patch have been determined the matching module 76
then utilizes characterization vectors including all of
these values for matching one point in an image to its
best match in a second image. Thus in this way the
additional data available in a colour image can be used
to increase the data which can be used to match points in
different images.

THIRD EMBODIMENT 

Although in the previous embodiments the present
invention has been described in the context of a feature
detection and characterization module 2 for a system for
generating three-dimensional computer models from images
taken from different viewpoints, the present invention
may also be used in a number of other ways. In this
embodiment of the present invention the detection and

characterizations for index images stored in the database
300. The matching module 76 then determines which of the
stored images best matches the input image by selecting
the image having the greatest number of unambiguous
matches.
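
The selection of the best matching stored image may be sketched as follows. The text does not state how a match is judged to be unambiguous, so the sketch assumes that a match counts as unambiguous when its ambiguity score falls below a threshold; the threshold of 0.5, the function names, the image names and the scores are all hypothetical.

```python
def count_unambiguous(ambiguity_scores, threshold=0.5):
    """Number of matches whose ambiguity score is below the assumed threshold."""
    return sum(1 for score in ambiguity_scores if score < threshold)


def best_matching_image(scores_by_image, threshold=0.5):
    """Name of the stored image with the greatest number of unambiguous matches."""
    return max(
        scores_by_image,
        key=lambda name: count_unambiguous(scores_by_image[name], threshold),
    )


# Hypothetical ambiguity scores of the input image's feature points
# against three stored index images.
scores_by_image = {
    "image_a": [0.1, 0.9, 0.8],
    "image_b": [0.2, 0.3, 0.95],
    "image_c": [0.7, 0.6, 0.99],
}

print(best_matching_image(scores_by_image))
```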

Thus in this way the matching module 76 determines which
of the images having characterization values stored in
the database 300 most closely corresponds to the image
received in the image buffer by determining the best
matches between characterized feature points for an image
in the image buffer and each of the images in the
database and then on the basis of those matches
determining which of the images in the database 300 most
closely corresponds to the image in the image buffer 60.
The CPU 64 then retrieves a copy of the image in the
database 300 and outputs the retrieved image for
comparison with the input image. Thus by characterizing
the image received by the image buffer 60 in the way
previously described a similar image stored in the
database 300 may be retrieved and output from the
database.

FOURTH EMBODIMENT 
In the processing of the previous embodiment an input
image was characterised and the characterisation of the
image was then compared to a database of images each of
which had previously been characterised to retrieve from

stored values for an original image therefore identifies
whether an image input into the image buffer 60 is a copy
of an earlier image whose characterisation is stored in
the database 300. In particular, by deliberately
introducing certain features into an image which will
result in the output of certain predefined
characterization values following the analysis of the
image by a feature detection and characterization module,
a means is provided which enables the identification of
the origin of subsequent copies of those images.

FIFTH EMBODIMENT 

In the previous embodiments the present invention has been
described in terms of apparatus for identifying and
characterizing feature points and matching those feature
points with similarly characterized feature points either
in other images or against a database of previously
characterized images. In this embodiment of the present
invention apparatus is provided which is arranged to
remove the effects of stretch and skew from an image and
to output an image transformed to account for the effect
of stretch and skew.

Figure 16 is a block diagram of apparatus in accordance
with the fifth embodiment of the present invention. The
apparatus in accordance with this embodiment of the
present invention is identical to the feature detection
and matching module 2 of the first embodiment except that

the output buffer 62.

In this way, by transforming an image in the image buffer
60 by a number of iterations utilizing the square root of
a calculated second moment matrix for the image, an output
image is generated which corresponds to the original
image transformed to a skew normalised frame. In this
way a number of images taken from different view points
which introduce a skew into an image can be transformed
to images where this skew is removed so that the
different images with the skew removed may be compared.

FURTHER AMENDMENTS AND MODIFICATIONS 

In the previous embodiments a detection module 72 has
been described which is arranged to identify feature
points in images corresponding to corners on objects in
the images. However, the detection module 72 could be
arranged to detect alternative features. Thus for
example instead of calculating normalised corner
strengths (where a value representative of a strength of
a corner is determined and scaled in accordance with the
size of the portion of an image used to detect a corner
strength), other values representative of some features
in an image could be calculated, with these values being
scaled to account for the variation in such values arising
due to the size of the region. Suitable features which
might be detected
could include points indicative of high curvature such as

feature points of interest. The detection of feature
points using windows larger than 14 by 14 for this
size of image has not been found to improve the ability
of a feature detection and matching module 2 to match
features more accurately. This is due to the increased
computational complexity required for calculating
smoothed values over such a large region and the fact
that the determination of a feature point utilizing such
a large region is not sufficiently specific to enable a
detected feature point to be accurately matched with
other points in other images.

In the detection module 72 described above the selection
of feature points for subsequent processing is described
in terms of selecting a desired number of feature points.
However, the normalised feature strength determined by
the detection module 72 could itself be used to filter a
list of potential feature points, with only those feature
points having a normalised feature strength greater than
a set threshold being utilized in subsequent processing.
The advantage of utilizing a threshold to select those
features which are selected for future processing is that
this ensures only those features having particularly
strong feature detection values are subsequently
processed.
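
The two selection strategies can be contrasted in a short sketch. The representation of a feature point as an (x, y, normalised strength) tuple, the function names, the threshold and the example values are all assumptions made for illustration.

```python
def select_strongest(points, count):
    """Keep a desired number of feature points with the highest strengths."""
    return sorted(points, key=lambda p: p[2], reverse=True)[:count]


def select_above_threshold(points, threshold):
    """Keep only feature points whose normalised strength exceeds a threshold."""
    return [p for p in points if p[2] > threshold]


# Hypothetical feature points as (x, y, normalised feature strength).
points = [(10, 12, 0.9), (40, 7, 0.2), (25, 30, 0.6), (3, 50, 0.4)]

print(select_strongest(points, 2))
print(select_above_threshold(points, 0.5))
```

Selecting a fixed number always returns that many points when enough are available, even if some are weak, whereas a threshold may return fewer points or none but guarantees a minimum strength.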

In the previous embodiments the characterization module
74 has been described arranged to characterize a feature

In the case of such rotational invariants a suitably
shaped image patch to characterize a point utilizing
rotational invariants would be a circular image patch.
By making the shape of a selected image patch dependent
upon the manner in which an image patch is to be
characterized, a means is provided to ensure that a
feature point is characterized so as to generate
characterization values invariant to the distortions for
which the characterization values are calculated. The size
of this image patch could then be arranged to be selected
on the basis of a scale associated with a detected
feature point.

Although in the above described embodiments one way of
associating a scale with a feature point has been
described, where the strength of the feature point is
reduced proportionately to account for the different
sizes of regions utilized to detect the feature point,
other ways of associating a scale with a detected feature
point could be used. Thus for example where features are
detected at a number of different scales a 'scale space'
maximum could be determined in a manner suggested by
Lindeberg in 'Scale-Space Theory in Computer Vision', Kluwer
Academic, Dordrecht, Netherlands, 1994. This suggests
that by detecting the strength of feature points across
a range of scales, a scale which associates a point most
strongly with a calculated feature strength can be
determined. The scale associated with such "scale space
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In the embodiments above the processing performed is 
described in terms of a CPU using processing defined by
programming instructions. However, some or all of the
processing could be performed using hardware.




1.2.2 Calculate edge strengths and directions



The edge strengths and directions are calculated using
the 7x7 integrated directional derivative gradient
operator discussed in section 8.9 of Haralick and
Shapiro.

The row and column forms of the derivative operator are
both applied to each pixel in the grey scale image. The
results are combined in the standard way to calculate the
edge strength and edge direction at each pixel.

The output of this part of the algorithm is a complete 
derivative image. 
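
As an illustration of combining the row and column results, the following sketch assumes that "the standard way" means taking the gradient magnitude as the edge strength and the arctangent of the two derivatives as the edge direction. The function name and the angle convention (radians, measured from the column axis) are assumptions made for this example.

```python
import math


def edge_strength_and_direction(d_row, d_col):
    """Combine row and column derivative responses at one pixel.

    Returns (strength, direction), where strength is the gradient
    magnitude and direction is the gradient angle in radians.
    """
    strength = math.hypot(d_row, d_col)
    direction = math.atan2(d_row, d_col)
    return strength, direction


# Example: row derivative 3.0 and column derivative 4.0 at a pixel
strength, direction = edge_strength_and_direction(3.0, 4.0)
```

Applying this function at every pixel yields the two planes (strength and direction) of the derivative image.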

1.2.3 Calculate edge boundaries 

The edge boundaries are calculated by using a zero
crossing edge detection method based on a set of 5x5
kernels describing a bivariate cubic fit to the
neighbourhood of each pixel.

The edge boundary detection method places an edge at all
pixels which are close to a negatively sloped zero
crossing of the second directional derivative taken in
the direction of the gradient, where the derivatives are
defined using the bivariate cubic fit to the grey level
surface. The subpixel location of the zero crossing is
also stored along with the pixel location.
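
The idea of a negatively sloped zero crossing with a subpixel location can be shown with a one-dimensional simplification. This sketch does not use the bivariate cubic fit; it assumes the second directional derivative has already been sampled at consecutive pixel positions along the gradient direction, and it locates the crossing by linear interpolation between samples.

```python
def negative_zero_crossings(values):
    """Find subpixel positions where the samples cross zero going downwards.

    values: second directional derivative sampled at positions 0, 1, 2, ...
    Returns a list of fractional positions of positive-to-negative crossings.
    """
    crossings = []
    for i in range(len(values) - 1):
        a, b = values[i], values[i + 1]
        if a > 0.0 and b <= 0.0:
            # Linear interpolation between sample i and sample i + 1.
            crossings.append(i + a / (a - b))
    return crossings


# One downward crossing between samples 1 and 2; the upward
# crossing between samples 3 and 4 is ignored.
positions = negative_zero_crossings([2.0, 1.0, -1.0, -2.0, 1.0])
```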


The method of edge boundary detection is described in
more detail in section 8.8.4 of Haralick and Shapiro.
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correlation based matcher to make the measurements of 
corner correspondences. 

The method assumes that the motion of corners is smooth
enough across the sequence of input images that a
constant velocity Kalman filter is useful, and that
corner measurements and motion can be modelled by
Gaussians.



2.2 Algorithm

1) Input corners from an image.

2) Predict forward using Kalman filter.

3) If the position uncertainty of the predicted corner
   is greater than a threshold, Δ, as measured by the
   state positional variance, drop the corner from the
   list of currently tracked corners.

4) Input a new image from the sequence.

5) For each of the currently tracked corners:

   a) search a window in the new image for pixels
      which match the corner;

   b) update the corresponding Kalman filter, using
      any new observations (i.e. matches).

6) Input the corners from the new image as new points
   to be tracked (first, filtering them to remove any
   which are too close to existing tracked points).
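
Steps 2 and 3 of the algorithm above can be sketched for a single image axis as follows. This is a simplified illustration under stated assumptions: the state is reduced to one position and one velocity, the process noise is assumed to act on the velocity only, and the covariance is stored as the three distinct entries of a symmetric 2x2 matrix. The function names and the track layout are hypothetical.

```python
def predict(state, cov, q):
    """Constant velocity Kalman prediction over one frame, for one axis.

    state: (position, velocity)
    cov:   (p00, p01, p11), entries of the symmetric state covariance
    q:     process velocity variance added at each prediction
    """
    pos, vel = state
    p00, p01, p11 = cov
    new_state = (pos + vel, vel)
    # P' = F P F^T + Q with F = [[1, 1], [0, 1]] and Q = diag(0, q)
    new_cov = (p00 + 2.0 * p01 + p11, p01 + p11, p11 + q)
    return new_state, new_cov


def prune(tracks, max_pos_var):
    """Drop tracks whose positional variance exceeds the threshold."""
    return [t for t in tracks if t[1][0] <= max_pos_var]


# One predicted corner track, then pruning against a threshold of 400
state, cov = predict((10.0, 2.0), (1.0, 0.0, 4.0), 50.0)
kept = prune([(state, cov), ((0.0, 0.0), (500.0, 0.0, 0.0))], 400.0)
```

Each prediction without a matching measurement inflates the positional variance, so a corner that stays unmatched for several frames is eventually dropped by the pruning step.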
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This uses the positional uncertainty (given by the top 
two diagonal elements of the state covariance matrix, K) 
to define a region in which to search for new 
measurements (i.e. a range gate). 


The range gate is a rectangular region of dimensions: 



The correlation score between a window around the 
previously measured corner and each of the pixels in the 
range gate is calculated. 

The two top correlation scores are kept.

If the top correlation score is larger than a threshold,
C0, and the difference between the two top correlation
scores is larger than a threshold, ΔC, then the pixel
with the top correlation score is kept as the latest
measurement.
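
The acceptance test on the two top correlation scores can be sketched as follows. The dictionary input, the function name and the decision to reject a range gate that contains fewer than two candidates are assumptions made for this example; the default thresholds are the values given for C0 and ΔC elsewhere in this description.

```python
def accept_match(scores, c0=0.9, delta_c=0.001):
    """Return the best-matching pixel, or None if the match is rejected.

    scores: dict mapping a pixel (x, y) in the range gate to its
            correlation score.
    """
    ranked = sorted(scores.items(), key=lambda kv: kv[1], reverse=True)
    if len(ranked) < 2:
        return None  # assumption: an unambiguous test needs two candidates
    (best_pixel, best), (_, second) = ranked[0], ranked[1]
    if best > c0 and (best - second) > delta_c:
        return best_pixel
    return None


clear_winner = accept_match({(1, 1): 0.95, (2, 1): 0.80})
ambiguous = accept_match({(1, 1): 0.95, (2, 1): 0.9495})
```

The second condition rejects matches where two pixels correlate almost equally well, since either could be the true correspondence.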

2.2.3 Update 

The measurement is used to update the Kalman filter in
the standard way:
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The algorithm's behaviour over a long sequence is anyway 
not too dependent on the initial conditions.

The process velocity variance is set to the fixed value
of 50 (pixels/frame)². The process velocity variance
would have to be increased above this for a hand-held
sequence. In fact it is straightforward to obtain a
reasonable value for the process velocity variance
adaptively.


The measurement variance is obtained from the following
model:

$\sigma^2 = (rK + a)$    (A-13)

where $K = \sqrt{K_{11} K_{22}}$ is a measure of the positional
uncertainty, "r" is a parameter related to the likelihood
of obtaining an outlier, and "a" is a parameter related
to the measurement uncertainty of inliers. "r" and "a"
are set to r=0.1 and a=1.0.

This model takes into account, in a heuristic way, the
fact that it is more likely that an outlier will be
obtained if the range gate is large.
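
A minimal numerical sketch of this heuristic follows. It assumes the model has the additive form σ² = rK + a, with K the geometric mean of the two positional variances; that form is a reading of the text rather than a confirmed formula, and the function name is hypothetical.

```python
import math


def measurement_variance(k11, k22, r=0.1, a=1.0):
    """Heuristic measurement variance, assuming sigma^2 = r * K + a.

    k11, k22: the two positional variances from the state covariance.
    r: weight related to the likelihood of obtaining an outlier.
    a: term related to the measurement uncertainty of inliers.
    """
    k = math.sqrt(k11 * k22)
    return r * k + a


small_gate = measurement_variance(4.0, 9.0)      # K = 6
large_gate = measurement_variance(100.0, 400.0)  # K = 200
```

Under this assumed form, a larger range gate yields a larger measurement variance, so measurements found in a wide search region are trusted less by the filter.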

The measurement variance (in fact the full measurement
covariance matrix R) could also be obtained from the
behaviour of the auto-correlation in the neighbourhood of
the measurement. However, this would not take into
account the likelihood of obtaining an outlier.
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The remaining parameters are set to the values: A-400 
pixels², C0=0.9 and ΔC=0.001.
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process . 

Finally the triangulation is textured, using
appropriate parts of the original images to provide
the texturing on the triangles.

3.2 Segmentation

The aim of this process is to segment an object (in front 
of a reasonably homogeneous coloured background) in an 
image using colour information. The resulting binary 
image is used in voxel carving. 

Two alternative methods are used: 

Method 1: input a single RGB colour value 
representing the background colour - each RGB pixel 
in the image is examined and if the Euclidean 
distance to the background colour (in RGB space) is 
less than a specified threshold the pixel is 
labelled as background (BLACK).

Method 2: input a "blue" image containing a 
representative region of the background. 

The algorithm has two stages: 

(1) Build a hash table of quantised background colours 

(2) Use the table to segment each image. 
Step 1) Build hash table 
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Step 2) Segment each image 

Go through each RGB pixel, "v", in each image.

Set "w" to be the quantised version of "v" as before. 

To decide whether "w" is in the hash table, explicitly
look at all the entries in the bin with index h(w) and
see if any of them are the same as "w". If yes, then "v"
is a background pixel - set the corresponding pixel in 
the output image to BLACK. If no then "v" is a 
foreground pixel - set the corresponding pixel in the 
output image to WHITE. 
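
Both segmentation methods can be sketched briefly. The quantisation step (integer division of each channel by 8) is an assumption made for this example, a Python set stands in for the hash table and its bins, and all function names are hypothetical. Images are represented as flat lists of (R, G, B) tuples for brevity.

```python
BLACK, WHITE = 0, 255


def segment_by_distance(pixels, background, threshold):
    """Method 1: label pixels close to one background colour as BLACK."""
    out = []
    for r, g, b in pixels:
        d2 = ((r - background[0]) ** 2 + (g - background[1]) ** 2
              + (b - background[2]) ** 2)
        # Compare squared distances to avoid a square root per pixel.
        out.append(BLACK if d2 < threshold ** 2 else WHITE)
    return out


def quantise(v, step=8):
    """Assumed quantisation: reduce each channel to a coarse level."""
    return tuple(c // step for c in v)


def build_background_table(blue_pixels, step=8):
    """Method 2, step 1: collect the quantised background colours."""
    return {quantise(v, step) for v in blue_pixels}


def segment_by_table(pixels, table, step=8):
    """Method 2, step 2: a pixel is background if its quantised colour is stored."""
    return [BLACK if quantise(v, step) in table else WHITE for v in pixels]


table = build_background_table([(0, 0, 255), (3, 2, 250)])
mask = segment_by_table([(1, 1, 252), (200, 30, 30)], table)
```

Method 2 tolerates a background with several distinct shades, because every shade present in the "blue" image contributes its own table entry.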

Post processing: for both methods a post process is
performed to fill small holes and remove small isolated
regions.

A median filter is used with a circular window. (A 
circular window is chosen to avoid biasing the result in 
the x or y directions.) 

Build a circular mask of radius "r". Explicitly store 
the start and end values for each scan line on the 
circle. 

Go through each pixel in the binary image. 

Place the centre of the mask on the current pixel. Count 
the number of BLACK pixels and the number of WHITE pixels 
in the circular region. 

If (#WHITE pixels ≥ #BLACK pixels) then set corresponding
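
The post-processing steps above amount to a majority vote over a circular neighbourhood. The sketch below is one way to write it under stated assumptions: ties are resolved in favour of WHITE, and mask positions falling outside the image are ignored. Both choices, and the function names, are assumptions made for this example.

```python
import math

BLACK, WHITE = 0, 255


def circular_mask(r):
    """Store, for each scan line of the circle, its start and end offsets."""
    spans = []
    for dy in range(-r, r + 1):
        half = math.isqrt(r * r - dy * dy)
        spans.append((dy, -half, half))
    return spans


def binary_median_filter(image, r):
    """Majority filter on a binary image (list of rows) with a circular window."""
    h, w = len(image), len(image[0])
    spans = circular_mask(r)
    out = [row[:] for row in image]
    for y in range(h):
        for x in range(w):
            white = black = 0
            for dy, x0, x1 in spans:
                yy = y + dy
                if not 0 <= yy < h:
                    continue  # scan line lies outside the image
                for xx in range(max(0, x + x0), min(w - 1, x + x1) + 1):
                    if image[yy][xx] == WHITE:
                        white += 1
                    else:
                        black += 1
            out[y][x] = WHITE if white >= black else BLACK
    return out


# A single isolated WHITE pixel on a BLACK background is removed.
speck = [[BLACK] * 5 for _ in range(5)]
speck[2][2] = WHITE
cleaned = binary_median_filter(speck, 1)
```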



Voxel carving is described further in "Rapid Octree 
Construction from Image Sequences" by R. Szeliski in 
CVGIP: Image Understanding, Volume 58, Number 1, July 
1993, pages 23-32. 

3.4 Marching cubes

The aim of the process is to produce a surface 
triangulation from a set of samples of an implicit 
function representing the surface (for instance a signed 
distance function). In the case where the implicit 
function has been obtained from a voxel carve, the 
implicit function takes the value -1 for samples which 
are inside the object and +1 for samples which are 
outside the object. 

Marching cubes is an algorithm that takes a set of
samples of an implicit surface (e.g. a signed distance
function) sampled at regular intervals on a voxel grid,
and extracts a triangulated surface mesh. Lorensen and
Cline and Bloomenthal give details on the algorithm
and its implementation.

The marching-cubes algorithm constructs a surface mesh by 
"marching" around the cubes while following the zero 
crossings of the implicit surface f(x)=0, adding to the 
triangulation as it goes. The signed distance allows the 
marching-cubes algorithm to interpolate the location of 
the surface with higher accuracy than the resolution of 
the volume grid. The marching cubes algorithm can be 
used as a continuation method (i.e. it finds an initial 
surface point and extends the surface from this point). 
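
The subvoxel accuracy mentioned above comes from interpolating along each cube edge whose end samples have opposite signs. The following sketch shows that single step only, using linear interpolation; the function name and the handling of the degenerate case are assumptions made for this example, and the full case-table triangulation is not reproduced.

```python
def interpolate_crossing(p0, p1, f0, f1):
    """Locate the zero crossing of the implicit function on one cube edge.

    p0, p1: 3D positions of the edge end points.
    f0, f1: implicit function samples at those points (opposite signs).
    """
    if f0 * f1 > 0:
        raise ValueError("edge does not cross the surface")
    if f0 == f1:
        return tuple(p0)  # both samples are zero: any point on the edge works
    t = f0 / (f0 - f1)
    return tuple(a + t * (b - a) for a, b in zip(p0, p1))


# Signed distance samples: the surface lies a quarter of the way along.
vertex = interpolate_crossing((0, 0, 0), (1, 0, 0), -1.0, 3.0)

# Voxel carve samples (-1 inside, +1 outside): always the edge midpoint.
carved = interpolate_crossing((0, 0, 0), (0, 2, 0), -1.0, 1.0)
```

With the two-valued function from a voxel carve, every crossing falls at an edge midpoint, which is why a true signed distance gives a smoother mesh at the same grid resolution.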
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plane of each triangle 
If (distance < threshold)

Discard V and keep new triangulation

Else

Keep V and return to old triangulation

OUTPUT

Output list of kept vertices
Output updated list of triangles

The process therefore combines adjacent triangles in the
model produced by the marching cubes algorithm, if this
can be done without introducing large errors into the
model.

The selection of the vertices is carried out in a random
order in order to avoid the effect of gradually eroding
a large part of the surface by consecutively removing
neighbouring vertices.
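
Two ingredients of this decimation step can be sketched in isolation: the distance test and the random visiting order. The sketch assumes the distance in question is the perpendicular distance from the candidate vertex V to the plane of each triangle in the new triangulation, and it does not attempt the re-triangulation itself. All function names are hypothetical, and degenerate (zero-area) triangles are not handled.

```python
import math
import random


def _sub(p, q):
    return tuple(a - b for a, b in zip(p, q))


def _dot(p, q):
    return sum(a * b for a, b in zip(p, q))


def _cross(p, q):
    return (p[1] * q[2] - p[2] * q[1],
            p[2] * q[0] - p[0] * q[2],
            p[0] * q[1] - p[1] * q[0])


def plane_distance(v, triangle):
    """Perpendicular distance from point v to the plane of a triangle."""
    a, b, c = triangle
    normal = _cross(_sub(b, a), _sub(c, a))
    return abs(_dot(normal, _sub(v, a))) / math.sqrt(_dot(normal, normal))


def can_discard(v, new_triangles, threshold):
    """True if v lies within the threshold of every new triangle's plane."""
    return all(plane_distance(v, t) < threshold for t in new_triangles)


def random_vertex_order(count, seed=None):
    """Visit vertex indices in random order to avoid eroding one region."""
    order = list(range(count))
    random.Random(seed).shuffle(order)
    return order


tri = ((0, 0, 0), (1, 0, 0), (0, 1, 0))
distance = plane_distance((0, 0, 0.5), tri)
```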
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3 . 6 Further Surface Generation Techniques 

Further techniques which may be employed to generate a 3D
computer model of an object surface include voxel
colouring, for example as described in "Photorealistic
Scene Reconstruction by Voxel Coloring" by Seitz and Dyer
in Proc. Conf. Computer Vision and Pattern Recognition
1997, p1067-1073, "Plenoptic Image Editing" by Seitz and
Kutulakos in Proc. 6th International Conference on
Computer Vision, pp 17-24, "What Do N Photographs Tell Us
About 3D Shape?" by Kutulakos and Seitz in University of



appear to be much of a problem. 



It has been found that, if every image is used for 
texturing then this can result in very large VRML models 
being produced. These can be cumbersome to load and 
render in real time. Therefore, in practice, a subset of 
images is used to texture the model. This subset may be 
specified in a configuration file. 
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said characterization means is arranged to determine the 
shape of a region to be used to generate characterization 
data for a feature on the basis of values of image data 
for a region of said image including said feature so that 
said characterization is substantially unaffected by
transformations resulting in linear distortions of said
region of said image.

4. Apparatus in accordance with claim 1, wherein said 
characterisation means comprises: 

means for determining the rate of change of 
luminance along two axes for a said region of said image; 

means for determining a transformed image utilizing 
said rates of change of luminance; and 

means for generating characterization data 
characterizing a said region of said image utilizing said 
transformed image.

5. Apparatus in accordance with claim 3 or claim 4, 
wherein said characterization means comprises: 

means for determining for a said region an averaged 
second moment matrix for a feature, wherein said averaged 
second moment matrix comprises a scaled sum of second 
moment matrices for each pixel in said region, and said 
second moment matrices for each of said pixels comprises: 



factor.
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7. Apparatus in accordance with claim 6, wherein said 
scaling factor is inversely proportional to the square 
root of the determinant of the averaged second moment 
matrix for a said region. 

8. Apparatus in accordance with claim 7, wherein said 
transformation means is arranged to iteratively generate 
transformed image data for a said region of said image 
until the calculated second moment matrix for said 
transformed image is equal to identity, and wherein said 
characterisation means is arranged to characterize said 
feature on the basis of said iteratively transformed 
image data. 

9. Apparatus in accordance with any preceding claim 
further comprising matching means for identifying matches 
between features in pairs of images, wherein said 
matching means is arranged to determine a match between 
features in pairs of images on the basis of 
characterization by said characterization means of 
features in said pair of images. 

10. Apparatus in accordance with claim 8, further 
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input means for receiving image data; 
transformation calculating means for determining a
transformation to remove affine distortions from image
data received by said input means; and
transformed image generation means for generating
transformed image data corresponding to image data
received by said input means transformed by said
transformation determined by said transformation
determination means;

wherein said transformation calculating means is
arranged to determine the transformation to remove the
effects of affine distortions from the received image
data by determining a transformation such that the second
moment matrix for said image transformed by said
transformation is substantially equal to the identity
matrix.

13. Apparatus in accordance with claim 12, wherein said
transformation calculating means is arranged to determine
said transformation by determining the square root of a
second moment matrix for image data received by said
input means.

14. Apparatus in accordance with claim 13, wherein said
transformation calculating means is arranged to calculate
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determination means determines that the second moment 
matrix for said transformed image is not substantially 
equal to identity. 

17. Apparatus in accordance with claim 16, wherein said
second moment determination means is arranged to
determine the rate of change of luminance along two axes
for a transformed matrix and to determine the second
moment matrix for a transformed image utilizing said
rates of change of luminance.

18. A method for generating characterization data
characterizing an image comprising the steps of:

receiving image data representative of an image;

detecting a plurality of features in said image; and

generating characterization data, characterising
said features, by generating data characterising portions
of said image data representative of regions of images
including said features, wherein said generation step is
such that said characterization data generated is
substantially unaffected by transformations resulting in
linear distortions of said regions including said
features.

19. A method in accordance with claim 18, wherein said
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said characterization step comprises the steps of: 

determining for a said region of an image including 
a feature an averaged second moment matrix for said 
feature, wherein said averaged second moment matrix 
comprises a scaled sum of second moment matrices for each 
pixel in said region, and said second moment matrices for 
each of said pixels comprises: 



$$M = \begin{pmatrix} I_x^2 & I_x I_y \\ I_x I_y & I_y^2 \end{pmatrix}$$

where $I_x$ and $I_y$ are values indicative of the rate of
change of luminance of an image along two different axes;
and

determining for said region of said image including
said feature a transformed image transformed to account
for distortions arising from stretch and skew on the basis
of said second moment matrix determined for said region;
and

calculating characterisation values for a feature on
the basis of the calculation of rotational invariants
determined for said transformed image.


23. A method in accordance with claim 22, wherein the 
determination of a transformed image comprises 
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transformed image data for a said region of said image 
until the calculated second moment matrix for said
transformed image is substantially equal to identity, and
said characterization comprises means characterizing said
feature on the basis of said iteratively transformed
image data.

27. A method of identifying correspondences between
features in pairs of images, comprising the steps of:

generating characterization data for images in
accordance with any of claims 18 to 26; and

determining a match between features in pairs of
images utilizing said characterization data.

28. A method in accordance with claim 27 further
comprising the step of generating a signal conveying
information defining said correspondences.

29. A method in accordance with claim 28, further
comprising the step of recording said generated signal on
a recording medium either directly or indirectly.

30. A method for generating a three-dimensional model
from images of objects comprising the steps of:

identifying the correspondence between features in
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image data on the basis of the interpolation of values 
for image data representative of the origins of pixels in 
the transformed image transformed by the inverse square 
root of a second moment matrix determined for said image 
stored in said storage means multiplied by a scaling 
factor inversely proportional to the square root of the 
determinant of said second moment matrix. 

33. A method in accordance with claim 32, further 
comprising the steps of determining a second moment 
matrix for a transformed image and generating further 
transformed image data from said transformed image data, 
if the second moment for said transformed image is not 
substantially equal to identity. 

34. A method in accordance with claim 33, wherein said 
second moment matrix for a transformed image is 
determined by the steps of: 

determining the rate of change of luminance along two axes for
a said transformed image, and determining said second
moment matrix utilizing said rates of change of
luminance.

35. In an apparatus for generating a three-dimensional 
computer model of an object by processing images of the 



images comprising:

storing image data; 

detecting the presence of features in the images 
represented by stored image data, 

generating characterization data for said features 
in the images in a manner substantially unaffected by 
linear distortions of regions of said images including a 
said feature; and 

matching features in different images utilizing said 
generated characterization data. 

37. In a method for generating a three-dimensional 
computer model of an object by processing images of the 
object taken from a plurality of different viewpoints to 
match features in the images, calculating the viewpoints 
at which the images were recorded using the matched 
features, and generating a three-dimensional computer 
model of the surface object using the calculated 
viewpoints, an improvement comprising matching features 
in the images by: 

storing image data; 

detecting the presence of features in the images 
represented by stored image data, 

generating characterization data for said features 
in the images in a manner substantially unaffected by 
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ABSTRACT 

IMAGE PROCESSING METHOD AND APPARATUS


An apparatus (2) for matching features in images of
objects taken from different viewpoints is provided
comprising: an image buffer (60) for receiving image
data; an output buffer (62) for outputting pairs of
matched features; and processing means (64-78) for
processing received image data to determine matched pairs
of features in images. The processing means (64-78)
includes a detection module (72) for detecting features
at a number of different scales to account for the
possibility that a feature in one image may correspond to
a larger or smaller feature in another image; a
characterization module (74) for generating
characterization data for selected features where the
characterization data is substantially independent of
changes of scale, and the effects of stretch and skew
resulting from viewing objects from different viewpoints;
and a matching module (76) for outputting as pairs of
matched features, features which most closely correspond
to each other and which are unambiguously better matches than
any alternative match between features in different
images.

Refer to Figure 5 
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